THE APPROXIMATE DISTRIBUTIONS OF THE MEAN AND
VARIANCE OF A SAMPLE OF INDEPENDENT VARIABLES

By P. L. Hsu 

The National University of Peking 

1. Introduction. In this paper we shall study the mean and variance of a 
large number, n (a sample of size n) of mutually independent random variables: 

(1) $\xi_1, \xi_2, \dots, \xi_n,$

having the same probability distribution represented by a (cumulative) distribution function $P(x)$. The $r$th moment, absolute moment, and semi-invariant of $P(x)$ are denoted by $\alpha_r$, $\beta_r$ and $\gamma_r$ respectively. It is assumed that for a certain integer $k \ge 3$, $\beta_k < \infty$ and that $\alpha_2 > 0$. Hence there is no loss of generality in assuming that

(2) $\alpha_1 = 0, \qquad \alpha_2 = 1.$

The characteristic function corresponding to $P(x)$ is denoted by $p(t)$.

We put 

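(3) $\bar\xi = \dfrac{1}{n}\sum_{r=1}^{n}\xi_r, \qquad \eta = \dfrac{1}{n}\sum_{r=1}^{n}(\xi_r - \bar\xi)^2,$

(4) $F(x) = \Pr\{\sqrt{n}\,\bar\xi \le x\}, \qquad G(x) = \Pr\left\{\dfrac{\sqrt{n}\,(\eta - 1)}{\sqrt{\alpha_4 - 1}} \le x\right\}.$

The definition of $G(x)$ implies that $\alpha_4 < \infty$ and $\alpha_4 - 1 > 0$. The case $\alpha_4 - 1 = 0$ provides an easy degenerate case which will be treated separately (section 4).
Cramér's theorem of asymptotic expansion¹ reads as follows: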

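Theorem 1. If $P(x)$ is non-singular and if $\beta_k < \infty$ for some integer $k \ge 3$, then

(5) $F(x) = \Phi(x) + \Psi(x) + R(x)$

where

(6) $\Phi(x) = \dfrac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-y^2/2}\,dy,$

$\Psi(x)$ is a certain linear combination of successive derivatives $\Phi^{(3)}(x), \dots, \Phi^{(3(k-3))}(x)$ with each coefficient of the form $n^{-\nu/2}$ times a quantity depending only on $k, \alpha_3, \dots, \alpha_{k-1}$ $(1 \le \nu \le k - 3)$ and

(7) $|R(x)| \le \dfrac{Q}{n^{(k-2)/2}}$

where $Q$ is a constant depending only on $k$ and $P(x)$.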

¹ H. Cramér: Random Variables and Probability Distributions (1937), Ch. 7. This book will be referred to as (C).

In particular, putting $k = 3$ we get that $|F(x) - \Phi(x)| \le Q/\sqrt{n}$ provided $P(x)$ is non-singular and $\beta_3 < \infty$. If the condition of non-singularity of $P(x)$ be removed, then Liapounoff's theorem² furnishes the weaker result: $|F(x) - \Phi(x)| \le A\beta_3 n^{-1/2}\log n$ where $A$ is a numerical constant.
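The gain promised by Theorem 1 over the plain normal approximation can be seen numerically. The sketch below is an illustration only and rests on assumptions not made in the text: the variables are taken to be centred exponential variables (so that $\alpha_3 = 2$ and the exact distribution of the sum is a gamma law), and $\Psi(x)$ for $k = 4$ is taken in its standard one-term form $-\alpha_3\Phi^{(3)}(x)/(6\sqrt{n})$. The function names are hypothetical.

```python
import math

def phi(x):
    """Standard normal density, the derivative of the function (6)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal distribution function (6)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gamma_cdf(n, y):
    """Pr{S <= y} for S the sum of n independent Exp(1) variables."""
    if y <= 0.0:
        return 0.0
    term, total = 1.0, 1.0
    for j in range(1, n):
        term *= y / j              # term is now y**j / j!
        total += term
    return 1.0 - math.exp(-y) * total

def max_errors(n, alpha3=2.0):
    """Largest |F - Phi| and |F - Phi - Psi| over a grid of x in [-4, 4].

    F is the exact distribution function of sqrt(n) * (sample mean) for
    centred Exp(1) variables; Psi is the assumed one-term correction
    -alpha3 / (6 sqrt(n)) * Phi'''(x), with Phi'''(x) = (x^2 - 1) phi(x).
    """
    plain = corrected = 0.0
    root_n = math.sqrt(n)
    for i in range(-400, 401):
        x = i / 100.0
        F = gamma_cdf(n, n + x * root_n)
        psi = -alpha3 / (6.0 * root_n) * (x * x - 1.0) * phi(x)
        plain = max(plain, abs(F - Phi(x)))
        corrected = max(corrected, abs(F - Phi(x) - psi))
    return plain, corrected

if __name__ == "__main__":
    for n in (25, 100, 400):
        print(n, max_errors(n))
```

On such a grid the corrected error should shrink roughly like $1/n$ while the uncorrected error shrinks like $1/\sqrt{n}$, in line with (7) for $k = 4$ and $k = 3$.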

Very recently Berry³ succeeded in removing the factor $\log n$ from Liapounoff's theorem under no other condition than that $\beta_3 < \infty$. We state here Berry's theorem:

Theorem 2. If $\beta_3 < \infty$, then

(8) $|F(x) - \Phi(x)| \le \dfrac{A\beta_3}{\sqrt{n}}$

where $A$ is a numerical constant.
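Theorem 2 needs no smoothness of $P(x)$, so it can be checked on a lattice distribution where $F(x)$ is computable exactly. The following sketch is illustrative only: it assumes standardized Bernoulli variables (a choice made here, not in the text) and evaluates $\text{l.u.b.}\,|F(x) - \Phi(x)|$ at the atoms of $F$, where the supremum of the difference between a step function and a continuous increasing function is attained. The helper names are hypothetical.

```python
import math

def Phi(x):
    """Standard normal distribution function (6)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def beta3(p):
    """Third absolute moment of a standardized Bernoulli(p) variable."""
    q = 1.0 - p
    return (p * p + q * q) / math.sqrt(p * q)

def sup_distance(n, p=0.5):
    """l.u.b. |F(x) - Phi(x)| for n standardized Bernoulli(p) variables.

    F is the distribution function of sqrt(n) * (sample mean); both the
    left limit and the value of F are compared with Phi at every atom.
    """
    q = 1.0 - p
    scale = math.sqrt(p * q * n)
    cdf = 0.0
    worst = 0.0
    for j in range(n + 1):
        x = (j - n * p) / scale
        worst = max(worst, abs(cdf - Phi(x)))      # left limit F(x - 0)
        cdf += math.comb(n, j) * p**j * q**(n - j)
        worst = max(worst, abs(cdf - Phi(x)))      # value F(x)
    return worst

if __name__ == "__main__":
    for n in (25, 100, 400):
        ratio = sup_distance(n) * math.sqrt(n) / beta3(0.5)
        print(n, ratio)   # stays bounded, as (8) requires
```

The printed ratio plays the role of the numerical constant $A$ in (8) for this particular distribution; the sketch keeps $n$ moderate because the binomial coefficients are converted to floating point.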

An essential step in the proof of these results is the selection of a weighting function $w(x)$ and the appraisal of the integral

(9) $\displaystyle\int_{-\infty}^{\infty} w(u)\,[F(u + x) - \Phi(u + x) - \Psi(u + x)]\,du$

($\Psi \equiv 0$ when $k = 3$). In his book² Cramér proves Theorem 1 by taking

(10) $w(u) = (-u)^{\omega - 1}$ when $u < 0$ and $w(u) = 0$ when $u > 0$ $\quad (0 < \omega < 1)$

and proves Liapounoff’s theorem by taking 

(ID »(.) - ^ 

On the other hand, Berry uses the following weighting function in his proof of Theorem 2:

(12) $w(u) = \dfrac{1 - \cos Tu}{u^2}.$

The unfortunate selection of the function (11) accounts for the presence of the factor $\log n$ in Liapounoff's theorem.

Now Cramér's proof of Theorem 1, based on the integral (9) with $w(u)$ defined in (10), makes use of a result on that integral due to M. Riesz. A more elementary proof than this can be devised. In fact, one has only to use, with Berry, the function (12) and to adopt his elementary appraisal⁴ of the integral

² (C), Ch. 7.

³ A. C. Berry: "The accuracy of the Gaussian approximation to the sum of independent variates." Trans. Amer. Math. Soc., Vol. 49 (1941), pp. 122-136. This paper will be referred to as (B).

* Berry proves the inequality (in our notation): 

dt 


1 — cos Tx 


{F(a; + fl) ♦(aj'H- cO] dx < f 
Jo 


1/(0 - I 
(9) in order to obtain the proof of Theorem 1. One of our purposes is therefore to give an elementary proof of Theorem 1, without reference to the above-mentioned result due to M. Riesz. Section 2 is devoted to this work.

We ought to add that Cramér's theorem and Berry's theorem correspond to Theorems 1 and 2 for the case in which the random variables (1) do not follow the same distribution. The proof given in Section 2 is adaptable to these more general theorems when subjected to appropriate modifications; the assumption of a common distribution function for (1) is only made for the sake of convenience.

So much for the known results for the approximate distribution of $\bar\xi$. By a purely formal operational method Cornish and Fisher⁵ obtain terms of successive approximation to the distribution function of any random variable $X$ with the help of its semi-invariants. It is hardly necessary to emphasize the importance of turning Cornish and Fisher's formal result (asymptotic expansion without appraisal of the remainder) into a mathematical theorem of asymptotic expansion which gives the order of magnitude of the remainder. In this paper we achieve this for the simplest function of (1) next to $\bar\xi$, viz. the $\eta$ in (3). We do not seek to remove the assumption of a common distribution for (1), as there will be no practical significance (e.g. in statistics) of $\eta$ if the variables (1) do not have the same probability distribution. Section 3 is devoted to the proof of the following theorems:

Theorem 3. If $\alpha_6 < \infty$ and $\alpha_4 - 1 - \alpha_3^2 \ne 0$ (it cannot be negative), then

(13) $|G(x) - \Phi(x)| \le \dfrac{A\alpha_6}{\sqrt{n}\,(\alpha_4 - 1 - \alpha_3^2)^{3/2}}$

where $A$ is a numerical constant.

Theorem 4. Let $P(x)$ be non-singular and let $\alpha_{2k} < \infty$ for some integer $k \ge 3$. Then

(14) $G(x) = \Phi(x) + \chi(x) + R_1(x)$

where $\Phi(x)$ is the function (6), $\chi(x)$ is a linear combination of the derivatives $\Phi'(x), \dots,$ with each coefficient of the form … times a quantity depending only on $k$ and $\alpha_3, \alpha_4, \dots, \alpha_{2k-2}$, and


(B), p. 128. The "appraisal" mentioned here refers to (50), which is contained in (B), p. 128. But Berry's appraisal of the integral in the right-hand side of the above inequality is in default. He writes


5 r (t^ 




e)t^ + 






(B, p. 132, line 3) whilst the last integral ought to be 



.1 — 4- c — dt. 


⁵ E. A. Cornish and R. A. Fisher: "Moments and cumulants in the specification of distributions." (Revue de l'Institut International de Statistique (1937), pp. 1-14.)

(15) $|R_1(x)| \le$ … if $k = 4, 5$ or $6$,

(16) $|R_1(x)| \le$ … if $k \ge 7$,

where $Q_k$ and $Q_k'$ are constants depending only on $k$ and $P(x)$.

It may be noticed that Theorem 3 is a "Berryian" theorem about $G(x)$, its characteristic feature being the absence of any condition on the distribution function except the two on its moments, and that Theorem 4 is a "Cramérian" theorem about $G(x)$, the characteristic feature being the assumption of non-singularity of $P(x)$ besides that $\alpha_{2k} < \infty$.

In proving these theorems we have devised a method which is applicable to getting similar results about functions other than $\eta$, such as functions commonly used in applied statistics: the higher moments about the means, the moment ratios (e.g. K. Pearson's $b_1$ and $b_2$), the covariance, the coefficient of correlation, and "Student's" $t$-statistic. Works on such functions are being done by my university colleagues, and the results will be published shortly.

If $\xi$ is any of the random variables (1), then

$0 \le \mathcal{E}\{a(\xi^2 - 1) + b\xi\}^2 = a^2(\alpha_4 - 1) + 2ab\,\alpha_3 + b^2$

for all real $(a, b)$. Hence $\alpha_4 - 1 - \alpha_3^2 \ge 0$, and $\alpha_4 - 1 - \alpha_3^2 = 0$ means that there is unit probability that $\xi$ assumes exactly two values. This easy degenerate case is first eliminated in Theorem 3 by the assumption $\alpha_4 - 1 - \alpha_3^2 \ne 0$ and then considered in section 4. In Theorem 4 the condition $\alpha_4 - 1 - \alpha_3^2 \ne 0$ is implied since $\xi$ cannot be a random variable of the nature just described owing to the non-singularity of $P(x)$.
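The inequality $\alpha_4 - 1 - \alpha_3^2 \ge 0$ and its case of equality are easy to check on small discrete distributions. The sketch below is an illustration under assumptions of its own: the helper names and the two example distributions are hypothetical, and the moments are computed after standardizing to $\alpha_1 = 0$, $\alpha_2 = 1$ as in (2).

```python
import math

def standardized_moments(values, probs):
    """Return (alpha_3, alpha_4) of a discrete law after standardization."""
    mean = sum(p * v for v, p in zip(values, probs))
    var = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
    sd = math.sqrt(var)
    a3 = sum(p * ((v - mean) / sd) ** 3 for v, p in zip(values, probs))
    a4 = sum(p * ((v - mean) / sd) ** 4 for v, p in zip(values, probs))
    return a3, a4

def moment_gap(values, probs):
    """The quantity alpha_4 - 1 - alpha_3^2, which cannot be negative."""
    a3, a4 = standardized_moments(values, probs)
    return a4 - 1.0 - a3 * a3

if __name__ == "__main__":
    # Two-point law: the degenerate case, the gap vanishes.
    print(moment_gap([0.0, 1.0], [0.3, 0.7]))
    # Three-point law: the gap is strictly positive.
    print(moment_gap([-1.0, 0.0, 1.0], [0.25, 0.5, 0.25]))
```

A two-point law gives a gap of zero up to rounding, which is the degenerate case excluded in Theorem 3.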

2. Lemmas. Throughout this paper $A, B, C$, etc. will denote positive numerical constants; $A_k, B_k$ $(A_{km}, B_{km})$, etc., will denote positive constants depending only on some integer $k$ (integers $k$ and $m$), and $Q_k$ $(Q_{km})$ will denote a positive constant depending only on $k$ ($k$ and $m$) and the distribution function $P(x)$. $\vartheta$, $\Theta$, $\Theta_k$ $(\Theta_{km})$, $\Delta_k$ $(\Delta_{km})$ will denote respectively quantities such that $|\vartheta| \le 1$, $|\Theta| \le A$, $|\Theta_k| \le A_k$ $(|\Theta_{km}| \le A_{km})$, $|\Delta_k| \le Q_k$ $(|\Delta_{km}| \le Q_{km})$. These symbols do not necessarily stand for the same quantity at each occurrence. Thus $2\vartheta = \Theta$, $k\Theta_k = \Theta_k$ etc. In particular any positive function of $k, \alpha_3, \dots, \alpha_k$ is a $Q_k$.


1.1. Cramér obtains the asymptotic expansion of the characteristic function of the distribution of $\sqrt{n}\,\bar\xi$, viz. …, when (1) do not have the same distribution, valid for $|t| \le Q_k n^{\cdots}$. Since we assume a common distribution for (1), so that the characteristic function is $\{p(t/\sqrt{n})\}^n$, we are able to derive an asymptotic expansion valid for $|t| \le Q_k\sqrt{n}$. The extension to …, presents no difficulty. This is done in the following three lemmas, of which Lemma 3 contains the final result.
Lemma 1. 


(17) 




Proof: Since $p(t) = 1 + \sum\cdots = 1 + q(t)$ say, we have, for


< 1, 


k\ 


3(0 < t < t < f;} = e - 2 < I. 


Hence 

(18) 


log p(o = 22 


(_!)/«iiWI' + ej 3(0 |!w+«i. 


For $1 \le j \le [\tfrac{1}{2}(k - 1)]$ let us expand each $\{q(t)\}^j$ to get a polynomial $q_j(t)$ of degree $k - 1$ and a remainder $r_j(t)$. In doing this we regard $q(t)$ formally as a polynomial of degree $k$ in $t$. For this polynomial we have the majorating relation

5(0 « 

whence 

{9(0}' « 

J 

which gives 

<19) I rXO i < t 1 1‘ < *4tj3t| t \\ 

T'^k • i 


Similarly, 

(20) I 5(0 1 <Au0,\t\ \ 

From (18), (19), (20) we obtain

(21) $\log p(t) = \sum_j q_j(t) + \Theta_k\beta_k|t|^k.$

Since the sum in (21) must equal the sum in (17), the Lemma is proved.

Lemma 2. Let $(\xi_1, \xi_2, \dots, \xi_m)$ be a random point with $\mathcal{E}(\xi_i) = 0$ and $\mathcal{E}(|\xi_i|^k) = \beta_{ki} < \infty$ for some integer $k \ge 3$ $(i = 1, \dots, m)$. Let $p(t_1, \dots, t_m)$ be the characteristic function. Then for $|t_i| \le$ … $(i = 1, \dots, m)$ we have

(22) $n\log p\!\left(\dfrac{t_1}{\sqrt{n}}, \dots, \dfrac{t_m}{\sqrt{n}}\right) = \displaystyle\sum_{r=2}^{k-1}\dfrac{i^r U_r}{r!\,n^{(r-2)/2}} + \dfrac{\Theta_k V_k}{n^{(k-2)/2}}$

where $U_r$ and $V_k$ are the $r$th semi-invariant and the $k$th absolute moment respectively of $t_1\xi_1 + \cdots + t_m\xi_m$.

Proof: If $|t_i| \le$ …, then $V_k^{1/k} \le \sqrt{n}$. Since $p(t_1/\sqrt{n}, \dots, t_m/\sqrt{n})$ is the value at $t = 1/\sqrt{n}$ of the characteristic function of $t_1\xi_1 + \cdots + t_m\xi_m$, it follows from Lemma 1 that for $\sqrt{n} \ge V_k^{1/k}$ we have (22).

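Lemma 3. Let $(\xi_1, \dots, \xi_m)$ be a random point with $\mathcal{E}(\xi_i) = 0$, $\mathcal{E}(\xi_i^2) = 1$ and $\mathcal{E}(|\xi_i|^k) = \beta_{ki} < \infty$ for some integer $k \ge 3$. Let $\rho_{ij} = \mathcal{E}(\xi_i\xi_j)$ $(\rho_{ii} = 1;\ i, j = 1, \dots, m)$ and the matrix $\|\rho_{ij}\|$ be positive definite. Let

(23) $\Delta = \det|\rho_{ij}|, \qquad \varphi(t_1, \dots, t_m) = e^{-\frac{1}{2}\sum_{i,j=1}^{m}\rho_{ij}t_it_j}.$

Let $p(t_1, \dots, t_m)$ be the characteristic function. Then there exists a $B_{km}$ such that for $|t_i| \le \dfrac{B_{km}\,\Delta\sqrt{n}}{\beta_{ki}^{3/k}}$ $(i = 1, \dots, m)$ we have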

I ■ ■ ■ > ~ ’ ■ ■ ■ > 1 > •••>*01 

(24) 

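where $\psi(it_1, \dots, it_m)$ is a polynomial each of whose terms has the form

$\dfrac{a_{\nu\nu_1\cdots\nu_m}(it_1)^{\nu_1}\cdots(it_m)^{\nu_m}}{n^{\nu/2}}$

with $1 \le \nu \le k - 3$, $3 \le \nu_1 + \cdots + \nu_m \le 3(k - 3)$, and $a_{\nu\nu_1\cdots\nu_m}$ depending only on $k$ and the moments $\mathcal{E}(\xi_1^{p_1}\cdots\xi_m^{p_m})$, $3 \le p_1 + \cdots + p_m \le k - 1$. If $k = 3$, then $\psi \equiv 0$.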

Proof. If | <,• | < i8jfc?^*A y/n, then | f,-1 < y/n since 

A < 1 and Pki > 1. It follows from Lemma 2 and the fact U 2 = ^PnUti that 


(26) 

f (^ 5 - • 

’ Vn) 

II 

;§^ 

r 

•S_^ 

<» 




where 




(26) 


8 = 

« i^Ur^ ^ e*Ft 

Vn h (r + 3) In"* ■*" ’ 
Regarding $s$ formally as a polynomial in $1/\sqrt{n}$, let us expand each $s^j/j!$ $(1 \le j \le k - 3)$ to get a polynomial $s_j$ of degree $k - 3$ in $1/\sqrt{n}$ and a remainder $r_j$. For the formal polynomial $s$ we have the majorating relation


whence 


which gives 


\/n r -0 \/n^ ’ 


1 yZHh 

j ! 


f Ol* < 

n'« vln'/* - n»«-« 


Since * < 1 as shown in the proof of Lemma 2, we have 

A y(k~i+i})/k Akm(^ 


rA(k-2) - 




,1/fcI. |\*-*+*/ 




Since 0H > 1 we have Hence 


Similarly 


j V o^(k-2)lk I . |3(*-2) 
jJfc-2 Akm Z-r P** U 

1^1 < • 

(k - 2)1- 


From (25), (28), (29) we get 

f ^)}' - . •■.<.)(■+1 -+ 1 :>,+ 

= ip(tl , * • •, ^m) {1 + ^(l<l , • • * , itm) } 

+ ^ p9!!‘-'‘(|(.p + ii.-r + ■•• + ii.i“;-”)i»(!., •■•. u«“' 

where ^(tii , * • * , ttm) stands for Ssy. The assertion about ^(tii , • • • , itm) 
announced in the lemma can now be seen without difficulty. It remains to show 
that with suitable Bkm in the lemma, we have 

, , -A/4m«-l 2) <! 

^(<1, •••.Ue'*'<e 
i.e.

(30) 

From (27) we have 


§ Si + 1*1 ^ S • 


4m’" 


(31) 




Hk 




If we choose Bkm < (4m”*""^Ajfe«)“^ (and Bkm < in order that the earlier 

results may not be affected), the A km here coinciding with the last written Akm 
in (31), we have, for | f, | < Bkm0ki^^A\/n, 

On the other hand, if $\lambda_1, \lambda_2, \dots, \lambda_m$ are the latent roots of $\|\rho_{ij}\|$ then each $\lambda_i \le m$ since their sum is $m$. Letting $\lambda_1$ be the smallest one we have


(33) 




2m*" 


(32) and (33) imply (30). Hence the lemma is proved. 

Let us write down the particular cases m = 1 and m = 2 of (24): 


(34) 




(35) 


„»(*-» 


n»(*- 


= e-*‘’(l + 4>m 


+ \t 


/ / <1 «* VM 

h {§ + I +--- + \k 

(l<^i< — p = ‘(nf*))- 


More specially let us rewrite (34) and (35) with $k = 3$:

/ / <1 t» -»(«f+<l+S.(,J,) 

{^K^nWn)] 


(37) 
In this paper only these last four formulae are needed; they are used in the proofs of Theorems 2, 1, 3, 4 respectively. Cases of $m > 2$ of (24) will be needed for the works on other functions alluded to in the introduction.

1.2. In the following group of lemmas, which culminate in Lemma 7, one finds a generalisation of the Riemann-Lebesgue theorem, viz. Lemma 6.

Lemma 4. Let $f(x)$ be a polynomial of degree $m > 0$, with real coefficients:

(38a) $f(x) = \displaystyle\sum_{\nu=0}^{m} a_\nu x^{m-\nu} \qquad (a_0 \ne 0).$

Then

(38) $\left|\displaystyle\int_0^1 e^{if(x)}\,dx\right| \le A_m|a_0|^{-1/m}.$

Proof: It is sufficient to prove the inequality for $\int_0^1 \cos f(x)\,dx$. Divide the interval into $A_m$ sub-intervals in each of whose interior none of the derivatives $f^{(i)}(x)$ $(i = 1, \dots, m)$ vanishes. It is sufficient to consider one of these sub-intervals, say $(a, b)$. Consequently each of the polynomials $f^{(i)}(x)$ is monotonic in $(a, b)$. Let

(39) $I = \displaystyle\int_a^b \cos f(x)\,dx.$


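Suppose first that $f'(x)$ is positive and increasing for $a < x < b$. Then

$|I| \le \varepsilon + \left|\displaystyle\int_{a+\varepsilon}^{b}\dfrac{1}{f'(x)}\,f'(x)\cos f(x)\,dx\right| = \varepsilon + \dfrac{1}{f'(a+\varepsilon)}\left|\displaystyle\int_{a+\varepsilon}^{b_1} f'(x)\cos f(x)\,dx\right| \qquad (a + \varepsilon \le b_1 \le b),$

by the second mean-value theorem. Hence

(40) $|I| \le \varepsilon + \dfrac{2}{f'(a + \varepsilon)}.$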


NowO </'(a + Je) = /'(a + 0 - €/"(a -f- ^«)/2, i < d < 1. Hence/'(a + 
€) > ie/"(a + Be), Since/"(x) \s monotonic, we have either/'(a + €) > 

(a + €) or/'(a + e) > i^"(a + Jc). In other words, there exists a constant Co , 
independent of a or €, such that i < C* < 1 and/'(a + c) > j€/(a + Cto). 

If /'"(®) > 0, we have, as before /"(a + Cot) > iCi^'"(a + Cit)^ where Co 
is independent of a or € and i < C* < 1. If /"'(a:) < 0, then, since 
0 < /"(a + 2Cot) « /"(a + Cot) + Cotr{a + SiCot), i < 1, we have 
/"(a + Coe) > — Cj€/'"(a ■+• 2^iCa€). As /'"(x) is monotonic, either /"(a -|- 
Cot) > -Cotria -f Cot) or/"(a + Cot) > -Coif'"(a -f 2C7,e). In all cases 
we obtain /"(a + Cot) > Bot | /'"(a + Cot) |, where Bo and Co are independent 
of a or €, and J < Cs < 2. Hence f(a + t) > iBot | /'"(o + Cot) |. Arguing 
with =fcf'"(a + Cot) as we did with/"(a + Cot), and so on until we come 
to $f^{(m)}$, we obtain $f'(a + \varepsilon) \ge$ … $|f^{(m)}(a + C_m\varepsilon)| =$ … $|a_0|$. Substituting in (40) and putting $\varepsilon = |a_0|^{-1/m}$ we obtain $|I| \le A_m|a_0|^{-1/m}$. The proof presupposes that $C_m\varepsilon \le b - a$. If the reverse inequality is true, then $|I| \le b - a < C_m|a_0|^{-1/m}$. Hence the lemma is true for $f'(x)$ positive and increasing in $(a, b)$.

If $f'(x)$ is positive and decreasing in $(a, b)$, we write

$I = \displaystyle\int_0^{b-a}\cos(-f(b - y))\,dy,$

$-f(b - y)$ being a polynomial with the leading coefficient $\pm a_0$ and the first derivative $f'(b - y)$, which is positive and increasing. This case reduces therefore to the preceding one. Finally, if $f'(x)$ is negative, we have only to notice that $I = \int_a^b \cos(-f(x))\,dx$. Hence the lemma is proved.

•'o 


Lemma 5. Let $f(x)$ be the polynomial (38a), and let $a_r \ne 0$ for some $r$, $0 \le r < m$.
Then


Proof: We may assume that $|a_r| > 1$, (41) being trivial if $|a_r| \le 1$. If $r = 0$ this reduces to Lemma 4. Suppose that the lemma is true for $a_0, a_1, \dots, a_{r-1}$. Let $f_1(x) = a_0x^m + \cdots + a_{r-1}x^{m-r+1}$, $f_2(x) = f(x) - f_1(x)$ and divide $(0, 1)$ into $A_m$ sub-intervals in each of which $f_1(x)$ is monotonic. It is sufficient to consider one of these sub-intervals, say $(a, b)$. We have

$I = \displaystyle\int_a^b \cos\{f_1(x) + f_2(x)\}\,dx = \int_a^b \cos f_1(x)\cos f_2(x)\,dx - \int_a^b \sin f_1(x)\sin f_2(x)\,dx.$

We have only to consider the integral of cosines, say $J$. Divide $(a, b)$ into sub-intervals in each of whose interior $\cos f_1(x)$ is monotonic and does not vanish. The number of such intervals does not exceed $(\tfrac{1}{2}\pi)^{-1}|f_1(b) - f_1(a)| \le (\tfrac{1}{2}\pi)^{-1}(|f_1(a)| + |f_1(b)|) \le 2(|a_0| + \cdots + |a_{r-1}|)$. Then, by the second mean-value theorem,

J * I 

c»s Mx) dx 1 (a <bi <b). 

a I 

Hence, applying Lemma 4 to ftix), we get 

“'■-11 ) ^ I flo 1 + • • • + I Or-l 1 ) 

On the hypothesis of induction we have [ 7 | < jlm 1 P®" (t = 0, • • • , r — 1). 
If I o< I > I a, p*"* for some i < r, then \ l\ < .4„ | a, ; if | o< | < 

I Or then by (42), 1 7 | < Am 1 Ur p*'*'". The proof is therefore complete. 
Lemma 6. Let $f(x)$ be the polynomial (38a) and $g(x)$ be summable over $(-\infty, \infty)$. Then for every $r$ we have

(43) $\displaystyle\lim_{|a_r| \to \infty}\int_{-\infty}^{\infty} e^{if(x)}g(x)\,dx = 0$, uniformly in $a_i$ $(i \ne r)$.

Proof: By Lemma 5 we have

$\displaystyle\lim_{|a_r| \to \infty}\int_0^1 e^{if(x)}\,dx = 0$, uniformly in $a_i$ $(i \ne r)$.

Hence

(44) $\displaystyle\lim_{|a_r| \to \infty}\int_a^b e^{if(x)}\,dx = 0$, uniformly in $a_i$ $(i \ne r)$,

for if $a \ne 0$ and $b \ne 0$, then $(a, b)$ is the sum or the difference of two intervals of the form $(0, c)$ or $(c, 0)$, and for the latter intervals the transformation $x = \pm cy$ reduces the interval of integration to $(0, 1)$.

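Let $G$ be any open set of finite measure. Then $G$ is the sum of a sequence $\{I_\nu\}$ of non-overlapping intervals. Since $\sum mI_\nu = mG < \infty$, we have

$\displaystyle\sum_{\nu > n} mI_\nu < \varepsilon, \qquad n \ge N.$

Hence

$\left|\displaystyle\int_G e^{if(x)}\,dx\right| \le \varepsilon + \sum_{\nu=1}^{N}\left|\int_{I_\nu} e^{if(x)}\,dx\right|$

which, together with (44), implies

(45) $\displaystyle\lim_{|a_r| \to \infty}\int_G e^{if(x)}\,dx = 0$ uniformly in $a_i$ $(i \ne r)$.

Let $S$ be any set of finite measure. Then there is an open set $G$ such that $G \supset S$ and $m(G - S) < \varepsilon$. Hence

$\left|\displaystyle\int_S e^{if(x)}\,dx\right| \le \varepsilon + \left|\int_G e^{if(x)}\,dx\right|.$

Hence, by (45),

(46) $\displaystyle\lim_{|a_r| \to \infty}\int_S e^{if(x)}\,dx = 0$ uniformly in $a_i$ $(i \ne r)$.

Now let $h(x)$ be any positive "simple" summable function, i.e. $h(x) = a_\nu > 0$ for $x \in S_\nu$ $(\nu = 1, 2, \dots, n)$ and $h(x) = 0$ otherwise. Since $h(x)$ is summable, each $S_\nu$ must be of finite measure. Hence

$\left|\displaystyle\int_{-\infty}^{\infty} e^{if(x)}h(x)\,dx\right| \le \sum_{\nu=1}^{n} a_\nu\left|\int_{S_\nu} e^{if(x)}\,dx\right|$

which, together with (46), implies

$\displaystyle\lim_{|a_r| \to \infty}\int_{-\infty}^{\infty} e^{if(x)}h(x)\,dx = 0$ uniformly in $a_i$ $(i \ne r)$.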




Finally, let $g(x)$ be any summable function $\ge 0$. Then by a well-known theorem⁶ we have $g(x) = \lim h_n(x)$, where $\{h_n(x)\}$ is an ascending sequence of positive summable simple functions. Hence

$\left|\displaystyle\int_{-\infty}^{\infty} e^{if(x)}g(x)\,dx\right| \le \left|\int_{-\infty}^{\infty} e^{if(x)}h_n(x)\,dx\right| + \int_{-\infty}^{\infty}(g(x) - h_n(x))\,dx.$

By monotonic convergence the last integral tends to 0 as $n \to \infty$. Hence

$\left|\displaystyle\int_{-\infty}^{\infty} e^{if(x)}g(x)\,dx\right| \le \varepsilon + \left|\int_{-\infty}^{\infty} e^{if(x)}h_n(x)\,dx\right|,$

which implies (43). If $g(x)$ is any summable function, we have only to consider the customary expression of $g(x)$ as the difference of two non-negative functions. This completes the proof.
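Lemma 6 can be watched at work in the simplest non-trivial case $m = 2$, $r = 0$. The sketch below is illustrative and makes its own assumptions: $g(x)$ is taken to be the standard normal density, for which the integral of $e^{i(ax^2 + bx)}g(x)$ has modulus $(1 + 4a^2)^{-1/4}e^{-b^2/(2(1 + 4a^2))}$, a known closed form used here only as a check; the function name and the quadrature parameters are hypothetical.

```python
import cmath
import math

def oscillatory_integral(a, b, half_width=8.0, steps=160000):
    """Midpoint-rule value of the integral of exp(i(a x^2 + b x)) g(x) dx.

    g is the standard normal density and the range is truncated to
    |x| <= half_width, where the neglected mass is negligible.
    """
    h = 2.0 * half_width / steps
    total = 0j
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        g = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        total += cmath.exp(1j * (a * x * x + b * x)) * g
    return total * h

if __name__ == "__main__":
    for a in (0.5, 5.0, 50.0):
        bound = (1.0 + 4.0 * a * a) ** -0.25
        # The modulus decreases with |a| and the bound does not involve b.
        print(a, abs(oscillatory_integral(a, 0.0)),
              abs(oscillatory_integral(a, 3.0)), bound)
```

The bound $(1 + 4a^2)^{-1/4}$ does not depend on $b$, which is the uniformity in the remaining coefficients asserted in (43).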

Lemma 7. Let $P(x)$ be a non-singular distribution function of a random variable $X$, and let

(47) $p(t_1, t_2, \dots, t_m) = \displaystyle\int_{-\infty}^{\infty} e^{i(t_1x + t_2x^2 + \cdots + t_mx^m)}\,dP.$

Then for every $r$ and every positive constant $c$ we have

(48) $\underset{|t_r| \ge c}{\text{l.u.b.}}\,|p(t_1, \dots, t_m)| < 1.$

Proof: We have $P(x) = a_1P_1(x) + a_2P_2(x)$, where $P_1(x)$ is absolutely continuous, $P_2$ is singular, $a_1 > 0$, $a_1 + a_2 = 1$. Hence

$|p(t_1, t_2, \dots, t_m)| \le a_1\left|\displaystyle\int_{-\infty}^{\infty} e^{i(t_1x + \cdots + t_mx^m)}P_1'(x)\,dx\right| + a_2.$

By Lemma 6 we may find $C > 0$ such that

$|p(t_1, t_2, \dots, t_m)| \le \tfrac{1}{2}a_1 + a_2 < 1$, if any $|t_i| \ge C$.

Suppose that

$\underset{|t_r| \ge c}{\text{l.u.b.}}\,|p(t_1, \dots, t_m)| = 1,$

then $c < C$ and we must have

(49) $\text{l.u.b.}\,|p(t_1, \dots, t_m)| = 1,$

the least upper bound being taken over the bounded closed set $c \le |t_r| \le C$, $|t_i| \le C$ $(i \ne r)$. Since $p(t_1, \dots, t_m)$ is a continuous function, it must attain its least upper bound in any bounded closed set. It follows that there is a point $(t_1^0, \dots, t_m^0)$ such that $t_r^0 \ne 0$ $(|t_r^0| \ge c)$ and $|p(t_1^0, \dots, t_m^0)| = 1$. But this implies⁷ that the distribution of $t_1^0X + \cdots + t_m^0X^m$ is discrete, i.e. that the distribution of $X$ itself is discrete, which contradicts the non-singularity of $P(x)$. Hence (49) is false and (48) is true.

⁶ H. Kestelman: Modern Theories of Integration (1937), p. 108.
⁷ Cf. (C), p. 26.
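The role of non-singularity in Lemma 7 shows up already for $m = 1$. The sketch below contrasts two standardized laws chosen here for illustration (they do not appear in the text): the uniform law on $[-\sqrt{3}, \sqrt{3}]$, which is absolutely continuous, and the symmetric two-point law on $\pm 1$, which is discrete. The grid search only approximates the least upper bound over a bounded range, and all names are hypothetical.

```python
import math

def cf_uniform(t):
    """Characteristic function of the uniform law on [-sqrt(3), sqrt(3)]."""
    if t == 0.0:
        return 1.0
    s = math.sqrt(3.0) * t
    return math.sin(s) / s

def cf_coin(t):
    """Characteristic function of the law giving mass 1/2 to +1 and to -1."""
    return math.cos(t)

def lub_abs(cf, c, upper=50.0, steps=200000):
    """Grid approximation to l.u.b. |cf(t)| over c <= t <= upper."""
    width = upper - c
    return max(abs(cf(c + width * i / steps)) for i in range(steps + 1))

if __name__ == "__main__":
    print(lub_abs(cf_uniform, 1.0))   # stays well below 1, as in (48)
    print(lub_abs(cf_coin, 1.0))      # returns to 1 near t = pi
```

For the discrete law the modulus of the characteristic function returns to 1 at multiples of $\pi$, so no bound of the form (48) is available; this is the obstruction removed by the hypothesis of Theorem 1.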

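1.3. In his cited work Berry⁸ shows that if $F(x)$ is any distribution function and if $\Phi(x)$ is the function (6), then there is a constant $a$ such that

(50) $\left|\displaystyle\int_{-\infty}^{\infty}\dfrac{1 - \cos Tx}{x^2}\{F(x + a) - \Phi(x + a)\}\,dx\right| \ge \sqrt{\dfrac{2}{\pi}}\,T\delta\left\{3\displaystyle\int_0^{T\delta}\dfrac{1 - \cos x}{x^2}\,dx - \pi\right\}$

where $\delta = \sqrt{\pi/2}\ \text{l.u.b.}\,|F(x) - \Phi(x)|$. This is easily extended to the following lemma, which needs no further proof.

Lemma 8. Let $F(x)$ be a distribution function and $F_1(x)$ be a function having the following properties: (i) $F_1(x)$ is bounded for all $x$, (ii) $F_1(x) \to 1$ as $x \to \infty$, $F_1(x) \to 0$ as $x \to -\infty$, (iii) $F_1(x)$ has a bounded derivative, $|F_1'(x)| \le M$. Let

$\delta = \dfrac{1}{2M}\ \text{l.u.b.}\,|F(x) - F_1(x)|.$

Then there exists a constant $a$ such that

(51) $\left|\displaystyle\int_{-\infty}^{\infty}\dfrac{1 - \cos Tx}{x^2}\{F(x + a) - F_1(x + a)\}\,dx\right| \ge 2MT\delta\left\{3\displaystyle\int_0^{T\delta}\dfrac{1 - \cos x}{x^2}\,dx - \pi\right\}.$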

1.4. In section 3 we define, for given $\varepsilon$, $k$, $\lambda$ and $z$, a function

(52) $G(x, y) = e^{-\varepsilon y^{2k}}$ if $z < x \le z + \lambda y^2$, $\quad G(x, y) = 0$ otherwise.

The introduction of $G(x, y)$ and the appraisal of its Fourier transform constitute the essence of our method of solving the problem of the asymptotic expansion of the distribution function $G(x)$. The solution of the same problem about other functions of (1) alluded to in section 3 is based on the introduction of functions playing the role of $G(x, y)$. We now prove the following lemma:

Lemma 9. Let $G(x, y)$ be defined by (52) and let

(53) $g(t_1, t_2) = \displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-i(t_1x + t_2y)}\,G(x, y)\,dx\,dy.$

Then 

(i) 

(iii) 1 gih, <j) I S j^, i^ + • 



⁸ (B), p. 128.

Proof:

(i) $|g(t_1, t_2)| \le \displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} G(x, y)\,dx\,dy = \lambda\int_{-\infty}^{\infty} y^2e^{-\varepsilon y^{2k}}\,dy =$ …

(ii) Putting A; = 3 we have 

g(h , k) = ~ - 6-*“"*’) dy, 

I g{ti , < 2 ) I < I u(y)v'''(y) dy , 

where u(y) = v(y) = e”****'. On integrating by parts we 

obtain 

(64) I g(k , t 2 ) I < 1 ^, I viy)u"\y) dy | < ] u"'{y) \ dy. 

Elementary calculation establishes that 

I—^ < c~“'‘(216X«’ U1" + 756Xe® 1 y \ 

\h\ 

+ 336X€|2/l' + 8X’l<inyl’+ 12X*|<,||y|). 

Substituting in (54) and making the transformation y = we get the result. 

(iii) We have 

I ff(<x . fe) I ^ 1^1 I £ e-'’‘-"»'(l - e-'“^''’) dy 

Integrating by parts twice we obtain 

By elementary calculations we get 

I ff(« 1 , «s) I < i£i /" Hk\ey^ + 2fc(fc + 3)X€y“ + 4X* | (i | y’ + 2X)e-‘''“ dy 

which, on the transformation y = gives the result. 

1.5. We prove a few additional lemmas used in the proof of Theorems 3 and 4.

Lemma⁹ 10. Let $u(x_1, \dots, x_m) \ge 0$ be summable in the $m$-dimensional space and let

(55) $v(t_1, \dots, t_m) = \displaystyle\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} e^{i(t_1x_1 + \cdots + t_mx_m)}\,u(x_1, \dots, x_m)\,dx_1\cdots dx_m.$

If $v(t_1, \dots, t_m)$ is summable in the $m$-dimensional space, then

(56) $u(x_1, \dots, x_m) = \dfrac{1}{(2\pi)^m}\displaystyle\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} e^{-i(t_1x_1 + \cdots + t_mx_m)}\,v(t_1, \dots, t_m)\,dt_1\cdots dt_m.$

⁹ Although the author believes that this lemma is almost classical, a proof is given owing to lack of reference.

Proof: Except for a constant factor the function $u(x_1, \dots, x_m)$ may be regarded as a probability density function. Hence by the well-known inversion formula of (55),

j* ' * j* * * * * > ^tn) * * * dXfn 

(57) 

1 /*'“ r“ /-A- 

■■■ Lis Ui •••<«-. 


Now $u(x_1, \dots, x_m)$ is almost everywhere the symmetric derivative of the interval function in the left-hand side of (57):


u{xi, 


Hence 


‘ > ^m) — lim 


1 


e-*0 (2c)” 


/•••/ u{y,,- 

(♦—I,?,** 


■•,ym)dyi • • • dy„ . 

sm) 


(58) 


m(xi 


I Xin) 


(2 t)" Ills (2e)" 


r ■£ 

•*—00 •^ot 






tm) dti ••• dtm- 


Owing to dominated convergence the order of the limit sign and the integration sign in (58) may be inverted. Hence (56) is true.

Lemma 11. We have

(59) $\dfrac{1}{\pi}\displaystyle\int_{-\infty}^{\infty} e^{-itu}\,\dfrac{1 - \cos Tu}{u^2}\,du = T - |t|$ if $|t| \le T$, $\quad = 0$ if $|t| > T$.

Proof: The Fourier transform of the function in the right-hand side of (59) is

$\displaystyle\int_{-T}^{T} e^{itu}(T - |t|)\,dt = \dfrac{2}{u^2}(1 - \cos Tu).$

Hence (59) follows from (56).
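The transform pair behind Lemma 11 can be confirmed by direct quadrature. The sketch below is an illustration under stated assumptions: the kernel is normalised as $(1 - \cos Tu)/(\pi u^2)$, the imaginary part of the integral is dropped because it vanishes by symmetry, and the infinite range is truncated, which leaves an error of order $1/\text{cutoff}$. The function name and parameters are hypothetical.

```python
import math

def kernel_transform(t, T, cutoff=2000.0, steps=400000):
    """Approximate (1/pi) * integral of cos(t u) (1 - cos(T u)) / u^2 du.

    The integrand is even in u, so the midpoint rule is applied on
    (0, cutoff) and the result doubled; the tail beyond the cutoff
    contributes at most about 4 / (pi * cutoff).
    """
    h = cutoff / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += math.cos(t * u) * (1.0 - math.cos(T * u)) / (u * u)
    return 2.0 * total * h / math.pi

if __name__ == "__main__":
    T = 2.0
    for t in (0.0, 1.0, 3.0):
        # Compare with the triangle function max(T - |t|, 0).
        print(t, kernel_transform(t, T), max(T - abs(t), 0.0))
```

The triangle function vanishes for $|t| > T$, which is why only the values of the characteristic functions on $|t| \le T$ enter the later estimates.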
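Lemma 12.

(60) $|\mathcal{E}(\xi_1 + \cdots + \xi_n)^k| \le A_kn^{k/2}\beta_k.$

Proof. As (60) is true for $k = 1$, let us assume, for induction, that it is true for $1, 2, \dots, k$. Then, by symmetry,

$\mathcal{E}(\xi_1 + \cdots + \xi_n)^{k+1} = n\,\mathcal{E}\{\xi_1(\xi_1 + \cdots + \xi_n)^k\} = n\displaystyle\sum_{r=0}^{k}\binom{k}{r}\mathcal{E}(\xi_1^{r+1}U^{k-r})$

where $U = \xi_2 + \cdots + \xi_n$. Since $\mathcal{E}(\xi_1) = 0$, we have

$\mathcal{E}(\xi_1 + \cdots + \xi_n)^{k+1} = n\displaystyle\sum_{r=1}^{k}\binom{k}{r}\alpha_{r+1}\,\mathcal{E}(U^{k-r}).$

On the hypotheses of induction we have $|\mathcal{E}(U^{k-r})| \le A_k(n - 1)^{(k-r)/2}\beta_{k-r} \le A_kn^{(k-1)/2}\beta_{k-r}$. Hence

$|\mathcal{E}(\xi_1 + \cdots + \xi_n)^{k+1}| \le k!\,A_kn^{(k+1)/2}\beta_{r+1}\beta_{k-r} \le A_{k+1}n^{(k+1)/2}\beta_{k+1}.$

Therefore the induction is complete.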


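3. Elementary Proof of Theorem 1. 2.1. We have defined

(61) $F(x) = \Pr\{\sqrt{n}\,\bar\xi \le x\}, \qquad \Phi(x) = \dfrac{1}{\sqrt{2\pi}}\displaystyle\int_{-\infty}^{x} e^{-y^2/2}\,dy$

with the characteristic functions

(62) $f(t) = \{p(t/\sqrt{n})\}^n, \qquad \varphi(t) = e^{-t^2/2}.$

Following Berry¹⁰ we use the equation

(63) $\displaystyle\int_{-\infty}^{\infty}\{F(x) - \Phi(x)\}e^{itx}\,dx = \dfrac{f(t) - \varphi(t)}{-it}.$

Let $\psi(it)$ be the polynomial in (34), and let us define $\Psi(x)$ as the function obtained from $\psi(it)$ through the replacement of each power $(it)^\nu$ by $(-1)^\nu\Phi^{(\nu)}(x)$. Integration by parts shows $(-1)^\nu\displaystyle\int_{-\infty}^{\infty}\Phi^{(\nu)}(x)e^{itx}\,dx = \dfrac{(it)^\nu\varphi(t)}{-it}$, whence

(64) $\displaystyle\int_{-\infty}^{\infty}\Psi(x)e^{itx}\,dx = \dfrac{\varphi(t)\psi(it)}{-it}.$

From (63) and (64) we obtain

(65) $\displaystyle\int_{-\infty}^{\infty}\{F(x) - \Phi(x) - \Psi(x)\}e^{itx}\,dx = \dfrac{f(t) - \varphi(t)\{1 + \psi(it)\}}{-it}.$

The function $\Psi(x)$ defined here is precisely the $\Psi(x)$ appearing in (5) under Theorem 1. Our task is to prove that

(66) $|F(x) - \Phi(x) - \Psi(x)| \le \dfrac{Q_k}{n^{(k-2)/2}}.$

Following Berry¹¹ we replace $x$ by $x + a$ in (65), getting

(67) $\displaystyle\int_{-\infty}^{\infty}\{F(x + a) - \Phi(x + a) - \Psi(x + a)\}e^{itx}\,dx = \dfrac{e^{-ita}[f(t) - \varphi(t)\{1 + \psi(it)\}]}{-it};$

multiply both sides of (67) by $T - |t|$ and integrate with respect to $t$ in $(-T, T)$:

$2\displaystyle\int_{-\infty}^{\infty}\dfrac{1 - \cos Tx}{x^2}\{F(x + a) - \Phi(x + a) - \Psi(x + a)\}\,dx = \int_{-T}^{T}(T - |t|)\,\dfrac{e^{-ita}[f(t) - \varphi(t)\{1 + \psi(it)\}]}{-it}\,dt;$

the reversion of order of integration involved is obviously justifiable. Hence

(68) $\left|\displaystyle\int_{-\infty}^{\infty}\dfrac{1 - \cos Tx}{x^2}\{F(x + a) - \Phi(x + a) - \Psi(x + a)\}\,dx\right| \le T\displaystyle\int_0^T\dfrac{|f(t) - \varphi(t)\{1 + \psi(it)\}|}{t}\,dt.$

¹⁰ (B), p. 127, Equation (23).
¹¹ (B), p. 127.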

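2.2. When in particular $k = 3$, (68) becomes

(69) $\left|\displaystyle\int_{-\infty}^{\infty}\dfrac{1 - \cos Tx}{x^2}\{F(x + a) - \Phi(x + a)\}\,dx\right| \le T\displaystyle\int_0^T\dfrac{|f(t) - \varphi(t)|}{t}\,dt.$

If we choose $a$ to be the $a$ in (50), the left-hand side of (69) is not less than

$\sqrt{\dfrac{2}{\pi}}\,T\delta\left\{3\displaystyle\int_0^{T\delta}\dfrac{1 - \cos x}{x^2}\,dx - \pi\right\}, \qquad \delta = \sqrt{\dfrac{\pi}{2}}\ \text{l.u.b.}\,|F(x) - \Phi(x)|.$

On the other hand, taking $T = A\sqrt{n}/\beta_3$ as in (36) the right-hand side of (69) is not greater than

$A\displaystyle\int_0^{\infty}\cdots\,dt \le A.$

Hence

(70) $T\delta\left\{3\displaystyle\int_0^{T\delta}\dfrac{1 - \cos x}{x^2}\,dx - \pi\right\} \le A.$

Now the left-hand side of (70), as a function of $T\delta$, is positive and increasing for sufficiently large $T\delta$, and becomes infinite as $T\delta \to \infty$. Hence (70) implies that $T\delta \le A$, i.e.

$\text{l.u.b.}\,|F(x) - \Phi(x)| \le \dfrac{A\beta_3}{\sqrt{n}},$

giving Theorem 2.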


2.3. Coming back to the general case, we see that the function $\Phi(x) + \Psi(x)$ has a bounded derivative: $|\Phi'(x) + \Psi'(x)| \le Q_k$, and also has all the properties of the function $F_1(x)$ in Lemma 8. On choosing $a$ in (68) to be the $a$ in (51) we obtain

(71) $Q_kT\delta\left\{3\displaystyle\int_0^{T\delta}\dfrac{1 - \cos x}{x^2}\,dx - \pi\right\} \le T\displaystyle\int_0^T\dfrac{|f(t) - \varphi(t)\{1 + \psi(it)\}|}{t}\,dt$

where

$\delta = Q_k\ \text{l.u.b.}\,|F(x) - \Phi(x) - \Psi(x)|.$

Let us take $T = (A_k\sqrt{n})^{k-2}$ with $A_k$ in accordance with (34). Then


(72) 


^ I 

Jo 


'//«) - ^(t){l+Hti)}I 


di 


= f + f , = Ji + /, 

^0 J 0*\/n 

By (34) we have 

(73) Ji<Qk f (<*”‘ + • • • + <’*~')e-*‘’ dt = Qk. 

Jo 

Also, 


2 say. 


(74) Ja < T ^ 


|p(t/\/n)|" 






+ r ^ 

J^kX^ 


v(011 + ^(io I 


t 


dt. 


The second term in the right-hand side of (74) is evidently $\le Q_k$. The first term does not exceed

(75) $Q_k\,n^{\cdots}\,T\ \underset{t \ge Q_k}{\text{l.u.b.}}\,|p(t)|^n.$

At this step we make use of the non-singularity of $P(x)$ and apply Lemma 7 for $m = 1$. We have

$\underset{t \ge Q_k}{\text{l.u.b.}}\,|p(t)| = e^{-Q_k}.$

Hence (75) does not exceed … $\le Q_k$. We have therefore

(76) $T\delta\left\{3\displaystyle\int_0^{T\delta}\dfrac{1 - \cos x}{x^2}\,dx - \pi\right\} \le Q_k, \qquad T = (A_k\sqrt{n})^{k-2}.$

Arguing with (76) as we did with (70) we conclude that

$\text{l.u.b.}\,|F(x) - \Phi(x) - \Psi(x)| \le \dfrac{Q_k}{T} = \dfrac{Q_k}{n^{(k-2)/2}}.$

(72) is valid for $T > 1$. If $T \le 1$, we have only to suppress the term $J_2$. Hence Theorem 1 is proved.

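4. Proof of Theorem 3 and Theorem 4. 3.1. In connection with the random variables (1), we assume that $\beta_{2k} < \infty$ for some integer $k \ge 3$ and define

(77) $\eta = \dfrac{1}{n}\displaystyle\sum_{r=1}^{n}(\xi_r - \bar\xi)^2, \qquad G(z) = \Pr\left\{\dfrac{\sqrt{n}\,(\eta - 1)}{\sqrt{\alpha_4 - 1}} \le z\right\}.$

Now,

$\dfrac{\sqrt{n}\,(\eta - 1)}{\sqrt{\alpha_4 - 1}} = X - \lambda Y^2,$

where

(78) $X = \dfrac{1}{\sqrt{n}\sqrt{\alpha_4 - 1}}\displaystyle\sum_{r=1}^{n}(\xi_r^2 - 1), \qquad Y = \sqrt{n}\,\bar\xi.$

Hence

(79) $G(z) = \Pr\{X - \lambda Y^2 \le z\}$

with

(80) $\lambda = \dfrac{1}{\sqrt{n(\alpha_4 - 1)}}.$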
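The decomposition of the standardized variance into $X - \lambda Y^2$ is a purely algebraic identity, since $\eta = n^{-1}\sum\xi_r^2 - \bar\xi^2$, and it can be checked on any sample. The sketch below is illustrative: the sample values and the value of $\alpha_4$ passed in are arbitrary choices made here, and the function name is hypothetical.

```python
import math

def decompose(sample, alpha4):
    """Return both sides of sqrt(n)(eta - 1)/sqrt(alpha4 - 1) = X - lam * Y^2.

    alpha4 is treated as a given constant greater than 1; the identity
    holds for every sample, whatever the value supplied.
    """
    n = len(sample)
    mean = sum(sample) / n
    eta = sum((v - mean) ** 2 for v in sample) / n       # the eta of (3)
    scale = math.sqrt(alpha4 - 1.0)
    X = sum(v * v - 1.0 for v in sample) / (math.sqrt(n) * scale)
    Y = math.sqrt(n) * mean
    lam = 1.0 / math.sqrt(n * (alpha4 - 1.0))
    left = math.sqrt(n) * (eta - 1.0) / scale
    right = X - lam * Y * Y
    return left, right

if __name__ == "__main__":
    print(decompose([0.3, -1.2, 2.0, 0.1, -0.7], alpha4=3.0))
```

Because $\lambda$ is of order $n^{-1/2}$, the term $\lambda Y^2$ is a small perturbation of $X$; the rest of the section quantifies its effect on the distribution function.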


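Let $W$ be the probability function of the distribution of the random point $(X, Y)$ and $f(t_1, t_2)$ be the characteristic function:

(81) $W(S) = \Pr\{(X, Y) \in S\}$ for every Borel set $S$ in $R_2$,

(82) $f(t_1, t_2) = \displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{i(t_1x + t_2y)}\,dW = \left\{p\!\left(\dfrac{t_1}{\sqrt{n}}, \dfrac{t_2}{\sqrt{n}}\right)\right\}^n,$

(83) $p(t_1, t_2) = \displaystyle\int_{-\infty}^{\infty}\exp\left\{i\left(t_1\dfrac{x^2 - 1}{\sqrt{\alpha_4 - 1}} + t_2x\right)\right\}dP(x).$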

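Let $G_1(z)$ be the distribution function of $X$. Then

(84) $G(z) - G_1(z) = \displaystyle\iint_{z < x \le z + \lambda y^2} dW = K(z)$, say.

Let

(85) $K_\varepsilon(z) = \displaystyle\iint_{z < x \le z + \lambda y^2} e^{-\varepsilon y^{2k}}\,dW.$

If we define (for fixed $z$) the function $G(x, y)$ by

(86) $G(x, y) = e^{-\varepsilon y^{2k}}$ if $z < x \le z + \lambda y^2$, $\quad G(x, y) = 0$ otherwise,

then

(87) $K_\varepsilon(z) = \displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} G(x, y)\,dW.$

Letting

(88) $\displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-i(t_1x + t_2y)}\,G(x, y)\,dx\,dy = g(t_1, t_2),$

we replace $x$ by $x - u$ in the integral and get

(89) $\displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-i(t_1x + t_2y)}\,G(x - u, y)\,dx\,dy = e^{-it_1u}\,g(t_1, t_2).$

Multiplying both sides by $\dfrac{1 - \cos Tu}{\pi u^2}$ and integrating with respect to $u$ we obtain, with the help of (59), Lemma 11,

(90) $\displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-i(t_1x + t_2y)}\,dx\,dy\int_{-\infty}^{\infty}\dfrac{1 - \cos Tu}{\pi u^2}\,G(x - u, y)\,du = (T - |t_1|)\,g(t_1, t_2)$ if $|t_1| \le T$, $\quad = 0$ if $|t_1| > T$;

the reversion of order of integration in the left-hand side is obviously justifiable. By Lemma 9 the right-hand side of (90) is summable in the whole plane of $(t_1, t_2)$. Hence, by Lemma 10,

(91) $\displaystyle\int_{-\infty}^{\infty}\dfrac{1 - \cos Tu}{\pi u^2}\,G(x - u, y)\,du = \dfrac{1}{4\pi^2}\iint_{|t_1| \le T} e^{i(t_1x + t_2y)}(T - |t_1|)\,g(t_1, t_2)\,dt_1\,dt_2.$

If we integrate both sides with respect to the probability function $W$, we obtain, on reversing the order of integration,

(92) $\displaystyle\int_{-\infty}^{\infty}\dfrac{1 - \cos Tu}{\pi u^2}\,du\iint G(x - u, y)\,dW = \dfrac{1}{4\pi^2}\iint_{|t_1| \le T}(T - |t_1|)\,g(t_1, t_2)\,f(t_1, t_2)\,dt_1\,dt_2.$

By (86) and (87),

(93) $\displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} G(x - u, y)\,dW = K_\varepsilon(u + z).$

Hence

(94) $\displaystyle\int_{-\infty}^{\infty}\dfrac{1 - \cos Tu}{\pi u^2}\,K_\varepsilon(u + z)\,du = \dfrac{1}{4\pi^2}\iint_{|t_1| \le T}(T - |t_1|)\,g(t_1, t_2)\,f(t_1, t_2)\,dt_1\,dt_2.$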
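We now take the functions

(96) $\varphi(t_1, t_2) = e^{-\frac{1}{2}(t_1^2 + t_2^2 + 2\rho t_1t_2)}$

and $\psi(it_1, it_2)$ as in (35), where

$\rho = \mathcal{E}(XY) = \dfrac{\alpha_3}{\sqrt{\alpha_4 - 1}}.$

Since the condition $\alpha_4 - 1 - \alpha_3^2 \ne 0$ is assumed in Theorem 3 and implied in Theorem 4, we have $|\rho| < 1$. Let

(97) $w(x, y) = \dfrac{1}{2\pi\sqrt{1 - \rho^2}}\exp\left\{-\dfrac{x^2 - 2\rho xy + y^2}{2(1 - \rho^2)}\right\}$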

and let y) be the function obtained from ^(t7i, ih) through the replacement 
of each power (lY,)" (ffe)’* by (-l)'‘+’W,.,.(x. y) = (-!)'•+'* 




we have 


(99) tP,.,,(x, y) = (ttO'Ht'**)'’*"*"*""'V(b ,k)dtidl,, 

whence, by Fourier inversion, 

(100) (j<i)'‘(*fe)'V(b ,k) - [ f e‘'‘*^“”'w,,„(x, y) dxdy. 

•^•p J—to 

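From the definition of $\gamma(x, y)$ it follows therefore

(101) $\displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{i(t_1x + t_2y)}\{w(x, y) + \gamma(x, y)\}\,dx\,dy = \varphi(t_1, t_2)\{1 + \psi(it_1, it_2)\}.$

A comparison of (101) with $\displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{i(t_1x + t_2y)}\,dW = f(t_1, t_2)$ shows that (94) will remain true if $K_\varepsilon(u)$ be replaced by

(102) $\displaystyle\iint_{u < x \le u + \lambda y^2} e^{-\varepsilon y^{2k}}\{w(x, y) + \gamma(x, y)\}\,dx\,dy = L_\varepsilon(u)$, say,

and $f(t_1, t_2)$ be replaced by $\varphi(t_1, t_2)\{1 + \psi(it_1, it_2)\}$. Hence

(103) $\displaystyle\int_{-\infty}^{\infty}\dfrac{1 - \cos Tu}{\pi u^2}\{K_\varepsilon(u + z) - L_\varepsilon(u + z)\}\,du = \dfrac{1}{4\pi^2}\iint_{|t_1| \le T}(T - |t_1|)\,g(t_1, t_2)\{f(t_1, t_2) - \varphi(t_1, t_2)[1 + \psi(it_1, it_2)]\}\,dt_1\,dt_2.$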
Let also

(104) $H(z) = \displaystyle\iint_{x \le z + \lambda y^2}\{w(x, y) + \gamma(x, y)\}\,dx\,dy, \qquad H_1(z) = \iint_{x \le z}\{w(x, y) + \gamma(x, y)\}\,dx\,dy,$

(105) $L(z) = H(z) - H_1(z) = \displaystyle\iint_{z < x \le z + \lambda y^2}\{w(x, y) + \gamma(x, y)\}\,dx\,dy.$

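3.2. We now consider the particular case $k = 3$ and prove Theorem 3. For $k = 3$ we have $\psi = \gamma = 0$ and so

(106) $H(z) = \displaystyle\iint_{x - \lambda y^2 \le z} w(x, y)\,dx\,dy, \qquad H_1(z) = \iint_{x \le z} w(x, y)\,dx\,dy = \Phi(z), \qquad L(z) = H(z) - H_1(z),$

(107) $L_\varepsilon(z) = \displaystyle\iint_{z < x \le z + \lambda y^2} e^{-\varepsilon y^6}\,w(x, y)\,dx\,dy,$

(108) $\displaystyle\int_{-\infty}^{\infty}\dfrac{1 - \cos Tu}{\pi u^2}\{K_\varepsilon(u + x) - L_\varepsilon(u + x)\}\,du = \dfrac{1}{4\pi^2}\iint_{|t_1| \le T}(T - |t_1|)\,g(t_1, t_2)\{f(t_1, t_2) - \varphi(t_1, t_2)\}\,dt_1\,dt_2.$

Now

$K_\varepsilon(u) - L_\varepsilon(u) = \{G(u) - \Phi(u)\} - \{H(u) - \Phi(u)\} - \{G_1(u) - \Phi(u)\} - \{K(u) - K_\varepsilon(u)\} + \{L(u) - L_\varepsilon(u)\},$

$0 \le H(u) - \Phi(u) \le$ …,

$|G_1(u) - \Phi(u)| \le$ … by Theorem 2,

$0 \le K(u) - K_\varepsilon(u) \le \varepsilon\,\mathcal{E}(Y^6) \le$ … by Lemma 12,

$0 \le L(u) - L_\varepsilon(u) \le A\varepsilon.$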
Hence


/ - -521?^ j(?(u + X) — $(u + X)) du 

•t-00 


(109) - {“•' + (a, - r)*v« + V(a4 - lyCf--?) 

+ GT ^ J I gf(^i, < 2 ) M/(^i > W ) ^ 2 ) I dtidti . 


It is easy to verify that 
at 

\ 8/2 


+ 




1 


(«4 - 1)*'" ' ViOi - 1)(1 - p^) 


\a4 — 1 — as/ 


For the left-hand side of (109) we refer to (50) and take x to be the number a
therein. Hence

- '} s "(»• + 

(110) + AT J J 1 ^(^1, ^ 2 ) I*1/(^1, ^ 2 ) — ^(ii, ^ 2 ) 1 dtidt2 

+ AT J J \g{f>i y i^\dtidt 2 , 


\tl\^T,\t2\^T 




By Lemma 9 (ii) we have 

T j J \g(ti,t 2 )\dtidh 


(111) < AT 


JI 


|<iisr.|<a|>r 




Hence 


Td 


(112) 


+ AT J J I g(ti , ^ 2 ) 11/ — ip\dti dh . 
By Lemma 9 (i) with k = 3 we have

(113) T fl \g\.\f-v\dtrdh<^ If \f-<p\dtidh. 


By (37) under Lemma 3, 




(114) 




with 


(116) 




- 1 

\/a£4 — 1 




We now take

the A coinciding with that in (114). Then 

A(1 — p*)\/n ^ A(1 — p*)(tt« — l)*\/n 

(117) 


Pn Son 

A(a:4 — 1 — a\)-\/a* — 1 Vn ^ A(a4 — 1 — a|)*-\/n _ 


(118) 


8a, 

A(1 — p*)y/n _ Ajoi — 1 — ai)\/n 
|S« (a, - Dft 


8a!« 


= T 


A (a, — 1 — a|)* \/ w V. A(a4 — 1 — a|)*\/ Ti 


^ — J. — u,/ V IT 

— „»/* • 

a, P, a. 


Hence (114) is true for $|t_1| \le T$ and $|t_2| \le T$. Using this fact on (113) we
obtain


( 119 ) 


T Jf \g\\f - <p\diidh 

i £ £ {sr^TT. I '■ r+o-i <• I’} * * 

^ AT), f a* . 1 

ATX 1 

“ ~ (a, - 1 - oj)**'* 


_ 1 

(a, Vo. - 1 + ft(«. - 1)*) „|)W* 


:< 


ii/e 


ATc^ 

n-v/;(«4 - 1 - a\f*' 
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Substituting in (112), setting < = {o(%T) ‘ and using (116) we obtain after some 
easy reduction 


T6 


( 120 ) 


' VntoT—~1) 1 “ «•)) (n(tti — 1 — ffli))' J ■ 


If n > (04 ~ 1 — as) , then the ri^t-hand side of (120) is < A, and so, 
arguing with (120), as we did with (70), we obtain 


(.21) I g(.) - *w u ^ ^ _ T_ „i )'- 

For » < ((*4 — 1 — alr^at, however, the ri^t-hand side of (121) > il(tM — 
1 — aj)“*ae > aud (121) becomes a triviality. Hence Theorem 3 is proved. 
3.3. To prove Theorem 4, we start again with the identity (103). We have

K.(u) - L.(w) = {(?(«) - H(«)} - {(?i(w) - H,(«)} 

( 122 ) 

- {K(u) - K.{u)} + \Uu) - L,{u)], 

(123) 0 < fi:(w) - K,{u) < e«(y**) < Quf by Lemma 12, 

(124) 0 < L(m) - L,(u) < t f f y^(w(x, y) + | •y{x, y)\) dxdy-^ Qkt. 

•*—00 •*—00 


Let us show that 

(125) $|G_1(u) - H_1(u)| \le Q_k / n^{(k-2)/2}.$

The function X == ^ f )ha8 the same structure as y/n J (with 

(oi ~ l)‘^(i? — 1) playing the role of {<); hence, by Theorem 1, there exists 
an asymptotic expansion of the distribution function $G_1(u)$. We shall see that
the terms of this asymptotic expansion are precisely $H_1(u)$, whence (125) follows
from Theorem 1.

It is obvious that for the polynomial $\psi(it_1, it_2)$ in (35), $\psi(it, 0)$ coincides with
the polynomial $\psi(it)$ in (34). Hence the terms of the asymptotic expansion
of $G_1(u)$ are the inversion of $e^{-t^2/2}\{1 + \psi(it, 0)\}$, viz.


(126) 


*(w) + ^ e-*‘*-*'V(t<, 0) dt. 


On the other hand, by (104), 

(127) Hi{u) * *(«) + y(^>y) 


and by (101) with <* = 0, 

(128) f e**’’dx f yix,y) dy 

•L-co 
Inversion of (128) gives


(129) 


r 7(x. u)d!/ = ^ £“ e-*‘*-“V(f<, 0) dt 


which establishes the equality of $H_1(u)$ and (126).
Using (122), (123), (124), (125) on (103) we get

cos Tu 


(130) 


£ ^ 7 - + *) 1 d« = A,T (* + 

+ OT J J I g(ti , (2) I *1/(^1, ^2) — <p{ti , ^2)[1 4 “ , it2)] 1 dtidt2 . 


If we expand

(131) H(u) == f f y) + yi^^y)} dxdy 

-V«2;gU 

in powers of n~* up to and including the term the remainder is obviously 

Hence 

(132) im = ^{u) + x{u) + 

where #(w) + x(w) is the group of terms of the Taylor expansion of (131) in 
powers of n“* up to and including the term From (130) and (132) we get 

^(G(u + z) — #(m + «) — x(m + z)) du 

where 

(134) I ^ T J J I g(ti , ( 2 ) I *1/(^1, < 2 ) — (p(ti , ^ 2 ) {1 + , fe)} I diidtn . 

|ti|!sr 

We are going to prove that the function χ(u) here defined satisfies all the
requirements of the function χ(x) in Theorem 4. The structure of χ(u) announced
in Theorem 4 is easily verifiable. It remains to prove the inequalities
(15) and (16) satisfied by

$|G(u) - \Phi(u) - \chi(u)|.$

It is obvious that the function Φ(u) + χ(u) has all the properties of the
function $F_1(u)$ in Lemma 8, having a bounded derivative $|\Phi'(u) + \chi'(u)| \le Q_k$.
Hence, on taking z in (133) to be the number a in (51), the left-hand side of (133)
does not exceed

Q,TS (s ^ dw - v), « = Qtl.u.b. I G(u) - $(m) - x(m) 1- 






Hence 


(1S5) + 

In order to appraise I we recall (35) under Lemma 3 (replacing therein each 
0ki by the larger number ffkifikt , and merging the latter into Qk) 

\f(ti, fe) - v(ti ,«*){! + mi, *fe)} 1 < T&j {2(1 <<!*+••• 

(136) 




+ kr"))^ 

for 

(137) \ti\< QkVn. 

Put $T = (Q_k\sqrt{n})^k$, with $Q_k$ here coinciding with that in (137), and then (136)
is valid for $|t_1| \le T^{1/k}$ and $|t_2| \le T^{1/k}$. Write

I-T // +!■ // +T // 

liil |*i|:sr 

By Lemma 9 (i), 




whence, by (136) 




Q^T 

•® ^ wK*-»v/a ' 


By Lemma 9 (iii) we have 


h<QkT If 

lii|sr.|«j|>r»'» 

+ {l/(h > fe) 1 + , tj) I 1 + 4'{ih > tfa) 1} dtidit 


Obviously, 

(140) 


I.u.b. v(ti , fe) 11 + mi . tfe)| = e"”®*- 


On the assumption of non-singularity of P(x) we have, by Lemma 7,
l.u.b. |/(<i, fe)l = 1-u-b- IP {:^ > 77^ I 

/ t, \1- 

(^’Vl “ 
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Hence 


h < QkTe- 


f I I feV (vi'** 

n A*'" ^ 

y^i/jiifc ^sTiSfc J ‘ 


For $I_3$ we have $|t_1| > T^{1/k} = Q_k\sqrt{n}$, and so Lemma 7 is applicable to $I_3$ in the
same manner as to $I_2$. Using Lemma 9 (i) on the factor $|g(t_1, t_2)|$ we get


Combining (135), (138), (139), (142), (143) we obtain 


H"'- 


< Qk 


^i/2_i_ ^ I ^ 

^ ^ y^Hk-2) niik-l)^Z/2k 


^3/2(i-l) I \ 

4-_!L^ 

‘ \A/2k “■ ^2f2k~ ‘ ^V2kl 


Putting € = ^k{k-i)i( 2 i^ 3 ) wc get, as the last term in (144) is < Qk , 

” (’ r * - ») S a + «.»“’ (;f54r«i + ■ 

If 4 < A; < 6, we take Z = /c — 2 and get 


‘H”' 


< Qib + Q/ 


f_L 


+ 1) < Q. 


Hence, by the argument following (70), 

l.u.b. 1 G(«) - m - x(n) I < f = , 

giving (15). If fc > 7, we take I = g- and get 




Q* + Oifc (1 + 


,(*-6)/(2(Jfc+8)) 


)< 0 *. 


Hence 


l.u.b. 1 G(u) - m - x(M) 1 < I = , 


giving (16). Therefore Theorem 4 is proved. 

5. When $\alpha_4 - 1 - \alpha_3^2 = 0$. If $\alpha_4 - 1 - \alpha_3^2 = 0$, then there is unit probability
that $\xi_i$ assumes exactly two values:

$\Pr\{\xi_i = a\} = p, \quad \Pr\{\xi_i = b\} = q, \quad p + q = 1.$
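As a numerical check of this degenerate condition, assume the normalization of (2), i.e. mean 0 and variance 1. A two-point law with probabilities p and q = 1 − p is then forced to take the values √(q/p) and −√(p/q), and the quantity α4 − 1 − α3² vanishes for every p. The Python sketch below is illustrative only and is not part of the original paper; the function name is hypothetical.

```python
import math


def standardized_two_point_moments(p):
    """Third and fourth moments of the two-point law with mean 0, variance 1.

    The law takes the value a = sqrt(q/p) with probability p and
    b = -sqrt(p/q) with probability q = 1 - p (0 < p < 1).
    """
    q = 1.0 - p
    a = math.sqrt(q / p)
    b = -math.sqrt(p / q)
    alpha3 = p * a**3 + q * b**3
    alpha4 = p * a**4 + q * b**4
    return alpha3, alpha4


for p in (0.1, 0.3, 0.5, 0.9):
    alpha3, alpha4 = standardized_two_point_moments(p)
    # alpha4 - 1 - alpha3^2 is zero up to floating-point rounding
    print(p, alpha4 - 1.0 - alpha3**2)
```

In closed form, α3 = (q − p)/√(pq) and α4 = 1/(pq) − 3, so that α4 − 1 − α3² = 1/(pq) − 4 − (1 − 4pq)/(pq) = 0.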
Let $\xi_i' = 1$ with probability p and $= 0$ with probability q. Then $\xi_i = b + (a - b)\xi_i'$,
$\eta = (a - b)^2 \cdot \frac{1}{n}\sum_i (\xi_i' - \bar{\xi}')^2$. Hence it is sufficient to consider the
variable $\frac{1}{n}\sum_i (\xi_i' - \bar{\xi}')^2 = \eta_1$. Letting $\sum_i \xi_i' = r = np + \sqrt{npq}\,X$ we have

$n\eta_1 = r - \frac{r^2}{n} = npq + (q - p)\sqrt{npq}\,X - pqX^2.$

We now consider two distinct cases:

Case (i). p ≠ q. Here

= Pr|(X + cVn)* > cV - 2|c|\/n«}, ® 

Thus F{z) = 1 if 2 > J I c I y/n. If « < J | c | y/n, then 

F(z) = Pr{X < -m - (c®n - 2 I c I y/^)*\ 

+ Pr[X > -cy/n + (c*n - 2 | c | y/hf\ = Pi( 2 ) + F^iz). 

To the random variable X Theorem 2 can be applied. Suppose that c < 0; 
then, by Tchebycheff’s inequality, 


P^iz) < Pr[X > -cn} < 


c^n (p — g)%’ 


By Theorem 2, 

Pi{z) = Pr{X < —m -- (c^n — 2\c\'\/nz)^} 


Hence 


= ^z) + 


Vn\p - q\ 


V npq 


The same inequality holds also for c > 0. 

Case (ii). p = q = 1/2. Here $n\eta_1 = \tfrac{1}{4}(n - X^2)$; hence


(146) > V} - S 'I = 

There is no asymptotic expansion for the distribution function of $\eta_1$. (See
(C), p. 83.)



SAMPLING INSPECTION PLANS FOR CONTINUOUS PRODUCTION
WHICH INSURE A PRESCRIBED LIMIT ON THE OUTGOING QUALITY

A. Wald and J. Wolfowitz 
Columbia University 

1. Introduction. This paper discusses several plans for sampling inspection of 
manufactured articles which are produced by a continuous production process, 
the plans being designed to insure that the long-run proportion of defectives 
shall not exceed a prescribed limit. The plans are applicable to articles which
can be classified as “defective” or “non-defective” and which are submitted for
inspection either continuously or in lots. In Section 2 the notions of “average
outgoing quality limit” and “local stability” are discussed. The valuable concept
of average outgoing quality limit for lot inspection is due to Dodge and
Romig [4], and that for inspection of continuous production to Dodge [1]. Section
3 contains a description of a simple inspection plan (SPA) applicable to
continuous production and a proof that the plan will insure a prescribed
average outgoing quality limit. Section 4 contains a proof that this inspection
plan also has the important property that it requires minimum inspection when
the production process is in statistical control. In Section 5 is contained the
description of a general class of plans which possess both these important properties.

The problem of adapting SPA to the case when the articles are submitted for 
inspection in lots instead of continuously, is treated in Section 6. Some methods 
of achieving local stability are discussed in Section 7 and a specific plan is devel¬ 
oped there. Finally Section 8 discusses the relationship between the present 
work and that of the earlier and very interesting paper of H. F. Dodge [1], 
mentioned above. 

If a quick first reading is desired the reader may omit the second half of Section 
3 (which contains a proof of the fact that SPA guarantees the prescribed average 
outgoing quality limit) and the entire Section 4 except for its title (the proof of 
the statement made in the title of Section 4 occupies the whole section). 

2. Fundamental notions. In this paper we shall deal only with a product 
whose units can be classified as “defective” or “non-defective.” We shall 
assume that the units of the product are submitted for inspection continuously, 
except in Section 6, where we assume that they are submitted in lots. Through¬ 
out the paper we shall assume that the inspection process is non-destructive, 
that it invariably classifies correctly the units examined, and that defective units, 
when found, are replaced by non-defectives. By the “quality” of a sequence of 
units is meant the proportion of defectives in the sequence as produced. By the 
“outgoing quality” (OQ) of a sequence is meant the proportion of defectives 
after whatever inspection scheme which is in use has been applied. If this 
scheme involves random sampling, then in general the OQ is a chance variable. 

(It depends on the variations of random sampling.) If the OQ converges to
a constant pa with probability one as the number of units produced increases
indefinitely, pa is called the “average outgoing quality” (AOQ). The AOQ
when it exists is therefore the average quality, in the long run, of the production 
process after inspection. It is a function of both the production process and the 
inspection scheme. These definitions are due to Dodge [1]. 

The “average outgoing quality limit” (AOQL) is a number which is to depend 
only on the inspection scheme and not at all on the production process. Roughly 
speaking, it is a number, characteristic of an inspection scheme, such that no 
matter what the variations or eccentricities of the production process, the AOQ 
never exceeds it. For the purposes of this paper we shall need the following 
precise definition: Let $c_i$ be zero or one according as the ith unit of the product,
before application of the inspection scheme, is a non-defective or a defective,
respectively. Let $d_i$ have a similar definition after application of the inspection
scheme. (We note that if the ith item was inspected, then $d_i = 0$; if the ith
item was not inspected, then $d_i = c_i$.) The sequence $c = c_1, c_2, \ldots, c_N, \ldots$,
ad inf. characterizes the production process.¹ The elements of $d = d_1, d_2, \ldots$,
ad inf. are in general chance variables. The number L is called the AOQL if it
is the smallest² number with the property that the probability is zero that

$\limsup_{N \to \infty} \frac{1}{N}\sum_{i=1}^{N} d_i > L,$

no matter what the sequence c.
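For a finite record of outgoing units the quantities in this definition can only be approximated. The Python sketch below is an illustration, not part of the original paper: it computes the running outgoing quality (the proportion of ones among the first N entries of d) and the largest value attained from some point on, as a finite stand-in for the lim sup. The function names are hypothetical.

```python
from itertools import accumulate


def running_outgoing_quality(d):
    """Return the list of sum(d[0:N]) / N for N = 1, ..., len(d)."""
    return [s / n for n, s in enumerate(accumulate(d), start=1)]


def tail_sup(values, start):
    """Largest entry of values[start:], a finite proxy for a lim sup."""
    return max(values[start:])


oq = running_outgoing_quality([1, 0, 0, 1])
print(oq)               # running proportions of defectives
print(tail_sup(oq, 1))  # largest running proportion from the 2nd unit on
```

No finite record can establish the AOQL itself, since the definition refers to every sequence c and to the limit N → ∞; the helper only summarizes one observed sequence.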

It should be noted that this definition of AOQL places no restrictions whatever 
on the production process, since all sequences c are admitted. It is too much 
to expect a production process to remain always in control; indeed, doubt as to 
whether statistical control always exists may cause a manufacturer to institute 
an inspection scheme. The inspection schemes which we shall give below will
yield a specified AOQL no matter what the variations in production are. If 
these schemes are employed, then, even if Maxwell’s demon of gas theory fame 
were to transfer his activities to the production process, he would be unsuccessful 
in an effort to cause the AOQL to be exceeded. A dishonest manufacturer might 
sometimes essay to do this. If we imposed restrictions on the sequence c and 

¹ This use of an infinite sequence to describe the production process deserves a few words.
What we consider in this paper are schemes applicable when the number of units produced
is large and which operate mathematically as if the production sequence were of infinite length.
Naturally the latter is never the case in actuality. However, the larger the number of
units produced the more nearly will the reality conform to the results derived from the
mathematical model. While the present definition uses explicitly the notion of an infinite
sequence, such a commonplace statement as “the probability is 1/2 that a coin will fall
heads up” uses this notion implicitly. It is also implicit in the intuitive meaning we ascribe
to such a word as “average,” which is in everyday use.

² It is not difficult to see that such a number always exists, for it is the lower bound of a
set which is non-empty (it contains the point one), bounded from below (zero is a lower 
bound), and closed. 
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determined the AOQL on that basis, we would run the danger that the relative 
frequency of defects in the sequence of outgoing units might exceed the AOQL if 
it happened that the actual sequence c did not satisfy the restrictions imposed. 

After we discuss below various possible sampling inspection plans which 
insure that the AOQL does not exceed a predetermined value L, it will be seen 
that for any given L > 0 there are many sampling inspection schemes which do 
this. To choose a particular sampling plan from among them the following 
considerations may be advanced: If two inspection plans S and S' both insure 
the inequality AOQL < L and if for any sequence c the average number of 
inspections required by S is not greater than that required by S' and if for some 
sequences c the average number of inspections required by S is actually smaller 
than that required by S', then S may be considered, in general, a better inspection
plan than S'. However, the amount of inspection required by a sampling
plan is not always the only criterion for the selection of a proper sampling
scheme. There may be also other features of a sampling plan which make it
more or less desirable. We shall mention here one such feature, called “local
stability,” which will play a role in our discussions later. Consider the sequence
d obtained from the sequence c by applying a sampling inspection scheme. Even 
if the AOQL does not exceed L, it may still happen that there will be many large 
segments of the sequence d within which the relative frequency of ones is con¬ 
siderably higher than L. For instance, it may happen that in the segment 
(di, • • • , dm) the relative frequency of ones is equal to |L, in the segment 
(dm+i , * • • , d 2 m) the relative frequency is equal to JL, in the segment (djm+i, 
• • • , dzm) the relative frequency is again equal to fL, and this is followed again 
by a segment of m elements where the relative frequency of ones is equal to JL, 
and so forth. If m is large, such a sequence d is not very desirable, since each 
second segment will contain too many defects. A sequence d is said to be not
locally stable if there exists a large fixed integer m such that the relative frequency
of ones in $(d_{k+1}, \ldots, d_{k+m})$ is considerably greater than L for many integral
values k. On the other hand, the sequence d is said to be locally stable if for
any large m the relative frequency of ones in $(d_{k+1}, \ldots, d_{k+m})$ is not substantially
above L for nearly all integral values k. This is clearly not a precise
definition of “local stability,” but merely an intuitive indication of what we want
to understand by the term, since we did not define what we mean by “large m,”
“many values of k,” “considerably above L,” etc. A precise definition of local
stability will not be needed in this paper, since it is not our intention to develop
a complete theory for the choice of the sampling plan. The idea of local stability
will be used in this paper merely for making it plausible that some schemes we
shall consider behave reasonably in this respect. A similar idea, called “protection
against spotty quality,” is discussed by Dodge [1]. A possible precise definition
of local stability could be given in terms of the frequency with which

$F(N) = \frac{1}{k+1} \sum_{i=N}^{N+k} d_i$ (k being fixed)

lies within given limits.
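The intuitive notion can be made concrete on a finite outgoing sequence by sliding a window of fixed length m along d and recording the relative frequency of ones in each window. The Python sketch below is an illustration, not part of the original paper; the function names and the choice of "fraction of windows above a limit" as a summary are assumptions made here.

```python
def window_frequencies(d, m):
    """Relative frequency of ones in every window of m consecutive entries of d."""
    if m <= 0 or m > len(d):
        raise ValueError("window length must satisfy 1 <= m <= len(d)")
    total = sum(d[:m])
    freqs = [total / m]
    for k in range(m, len(d)):
        total += d[k] - d[k - m]   # slide the window one unit to the right
        freqs.append(total / m)
    return freqs


def fraction_of_windows_above(d, m, limit):
    """Share of the windows whose relative frequency of ones exceeds limit."""
    freqs = window_frequencies(d, m)
    return sum(fr > limit for fr in freqs) / len(freqs)


print(window_frequencies([0, 0, 1, 1], 2))
print(fraction_of_windows_above([0, 0, 1, 1], 2, 0.4))
```

A sequence that is locally stable in the sense above would show a small share of windows noticeably above L for every large m; what counts as small or noticeable is left open, as in the text.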
3. A sampling inspection plan which insures a given AOQL no matter what
the variations in the production process. The only feature of the sampling
(inspection) plan (SP) studied in this section and hereafter referred to as SPA
which we shall consider here is that it insures the achievement of a specified
AOQL. Considerations leading to a choice among several schemes are postponed
to later sections.

For convenience, let f be the reciprocal of a positive integer. SPA calls for
alternating partial inspection and complete inspection. Partial inspection
is performed by inspecting one element chosen at random from each of successive
groups of $\frac{1}{f}$ elements. Complete inspection means the inspection of every
element in the order of production. SPA is completely defined when a rule
is given for ending one kind of inspection and beginning the other.

It is clear that all SP need not be of the above class. Thus, for example, a
scheme might consist of partial inspection with various f's employed in various
sequences. We make no attempt in this paper to examine all possible schemes.
For simplicity in practical operation, alternation of complete inspection and
partial inspection with fixed f would seem reasonable. The Dodge scheme [1]
is of this type.

We shall also not discuss the question of a choice of the constant f, but will
assume that a particular value has been chosen for various reasons and is a datum
of our problem. Reasons which might influence a manufacturer in his choice
of f could be contract specifications which impose a minimum on the amount of
inspection, or psychological grounds to the same effect. The manufacturer
may desire a certain minimum amount of inspection in order to detect malfunctioning
of his production process. Also f controls local stability to some
extent. The consequences of a choice of f as they appear in the theory below
may also play a role.

Returning to SPA, we begin with partial inspection. Let L be the specified
AOQL. Denote by $k_N$ the number of groups of $\frac{1}{f}$ units in which defectives
were found as the result of partial inspection from the beginning of production
through the Nth unit. SPA is as follows:

(a) Begin with partial inspection.

(b) Begin full inspection whenever

$e_N = \frac{k_N\left(\frac{1}{f} - 1\right)}{N} > L.$

(c) Resume partial inspection when

$e_N \le L.$

(d) Repeat the procedure. (It will be recalled that defective units, when
found, are always to be replaced with non-defectives.)
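Rules (a) to (d) can be traced on a finite production sequence. The following Python sketch is illustrative only and is not part of the original paper. It assumes the criterion $e_N = k_N(1/f - 1)/N$, evaluates it before each unit during complete inspection and before each group during partial inspection, and leaves a trailing incomplete group uninspected. The names `simulate_spa`, `f_inv` (the integer 1/f) and `seed` are hypothetical.

```python
import random


def simulate_spa(c, f_inv, L, seed=0):
    """Apply SPA to a finite 0/1 production sequence c (1 = defective).

    f_inv is the group size 1/f (a positive integer) and L the target AOQL.
    Returns (d, inspected): the outgoing 0/1 sequence and the number of
    units inspected.
    """
    rng = random.Random(seed)
    d = list(c)
    n = len(c)
    k = 0          # groups in which partial inspection found a defective
    inspected = 0
    i = 0          # number of units processed so far (the current N)
    while i < n:
        e = k * (f_inv - 1) / i if i > 0 else 0.0   # statistic e_N
        if e > L:
            # (b) complete inspection: inspect the next unit, replace if defective
            d[i] = 0
            inspected += 1
            i += 1
        else:
            # (a)/(c) partial inspection: one random unit from the next group
            if i + f_inv > n:
                break  # assumption: a trailing incomplete group is left as is
            pick = rng.randrange(i, i + f_inv)
            if c[pick] == 1:
                k += 1
                d[pick] = 0
            inspected += 1
            i += f_inv
    return d, inspected


d, inspected = simulate_spa([1] * 20000, 10, 0.05)
print(sum(d) / len(d), inspected)
```

In the extreme case shown, where every unit produced is defective, the outgoing proportion of defectives in this sketch stays close to L, because each partially inspected group passes 1/f − 1 defectives and complete inspection then runs until $e_N$ has fallen back to L.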
It is to be observed that in this plan the number of partial inspections increases
without limit. For, while complete inspection is going on, the value of $k_N$
remains constant, so that after a long enough period of complete inspection the
denominator N of the expression which defines $e_N$ will have increased sufficiently
for $e_N$ to be not greater than L. On the other hand, complete inspection may
never occur. This will be the case if, for example, no defectives or very few
defectives are produced.
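The length of a spell of complete inspection follows from the same expression. As a minimal sketch, assuming $e_N = k_N(1/f - 1)/N$ and exact rational arithmetic (the helper name `resume_point` is hypothetical and not from the paper), the first N at which partial inspection may resume is the smallest integer N with $k_N(1/f - 1)/N \le L$:

```python
import math
from fractions import Fraction


def resume_point(k, f_inv, L):
    """Smallest integer N with k * (f_inv - 1) / N <= L, for k >= 1.

    L may be a Fraction, an int, or a decimal string such as "0.05";
    exact arithmetic avoids rounding trouble at the boundary.
    """
    return math.ceil(Fraction(k * (f_inv - 1)) / Fraction(L))


print(resume_point(1, 10, "0.05"))
print(resume_point(2, 10, Fraction(1, 20)))
```

Since $k_N$ does not change during complete inspection, this N is finite for every value of $k_N$, which is the reason partial inspection always recurs.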

We shall now show that the AOQL of the above SP is L. We first note that,
at N, $e_N$ can increase only by $\left(\frac{1}{f} - 1\right)/N$. Hence, for sufficiently large N,
$e_N \le L + \epsilon$, where ε > 0 may be arbitrarily small.

Suppose now that the production process is subject to any variations whatsoever,
i.e., the sequence

$c = c_1, c_2, \ldots, c_N, \ldots,$ ad inf.

is any arbitrary sequence whatever (by their definition the $c_i$ are all zero or one).
Our result is therefore proved if we show that, with probability one,

(3.1) $\lim_{N\to\infty}\left(e_N - \frac{1}{N}\sum_{i=1}^{N} d_i\right) = 0$

for this arbitrary c, and that for at least one c

(3.2) $\lim_{N\to\infty} e_N = L.$


Let S(N) be the number of groups of $\frac{1}{f}$ units which have been partially inspected
through the Nth unit. Define $x_i$ as zero if in the ith partially inspected
group a non-defective was found and as one if a defective was found. We have

$k_N = \sum_{i=1}^{S(N)} x_i.$

Since the number of times partial inspection takes place increases indefinitely,
S(N) → ∞ as N → ∞. Also S(N) ≤ fN < N. Let $a_j$ be the serial number
of the last unit in the jth partially inspected group. Then for all j the expected
value $E(x_j)$ of $x_j$ is given by

$E(x_j) = f \sum_{i=a_j - \frac{1}{f} + 1}^{a_j} c_i.$

We have, for all j,

$\sum_{i=a_j - \frac{1}{f} + 1}^{a_j} (c_i - d_i) = x_j$


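so that

(3.3) $E\left[\left(\frac{1}{f} - 1\right)x_j - \sum_{i=a_j - \frac{1}{f} + 1}^{a_j} d_i\right] = 0.$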
Also from (3.3) it follows, since $x_j$ is the value of a binomial chance variable
from a population of fixed number $\frac{1}{f}$, that there exists a positive constant β
such that

(3.4) $\sigma^2\left(\left(\frac{1}{f} - 1\right)x_j - \sum_{i=a_j - \frac{1}{f} + 1}^{a_j} d_i\right) < \beta,$



where $\sigma^2(x)$ is the variance of a chance variable x. Now a theorem of Kolmogoroff
(Kolmogoroff [2], Fréchet [3], p. 254) states:

A sequence of chance variables with zero means and variances $\sigma_1^2, \sigma_2^2, \ldots$
converges with probability one towards zero in the sense of Cesàro if

(3.5) $\sum_{j=1}^{\infty} \frac{\sigma_j^2}{j^2}$



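converges. The inequality (3.4) permits us to apply this theorem to the sequence
of chance variables of which the jth (j = 1, 2, ... ad inf.) is

$\left(\frac{1}{f} - 1\right)x_j - \sum_{i=a_j - \frac{1}{f} + 1}^{a_j} d_i.$

Hence, with probability one,

$\lim_{N\to\infty} \frac{1}{S(N)}\left[\left(\frac{1}{f} - 1\right)k_N - \sum_{i=1}^{N} d_i\right] = 0,$

since the units which are fully inspected contribute nothing to $\sum d_i$. Since
S(N) < N, the desired result (3.1) is a fortiori true.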

If c is such that all the $c_i$ are one, it is readily seen that (3.2) holds. If many
(this adjective can be precisely defined) defectives are produced, this will also 
be the case. This completes the proof of the fact that the AOQL of SPA is L 
no matter how capriciously the production process may vary. 


4. When the production process is in statistical control, SPA requires minimum
inspection. The production process is said to be in statistical control if there
is a positive constant p < 1 such that, for every i, the probability that $c_i = 1$
is p and is independent of the values taken by the other c's. We shall see that
if the process is in statistical control and if SPA is applied to it, the specified
AOQL is guaranteed with a minimum amount of inspection.

The number of units inspected through the Nth unit produced is

(4.1) $I(N) = N - \left(\frac{1}{f} - 1\right) S(N).$

If the process is in statistical control we have, with probability one,
(4.2) $\lim_{N\to\infty} \frac{k_N}{S(N)} = p$


by the strong law of large numbers. Shortly we shall prove the existence of a
constant L* such that, with probability one,

(4.3) $\lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^{N} d_i = L^*.$


Assume for the moment that this is so. Since it is only by inspection that defectives
are removed, and the units selected for inspection are in statistical control
like the original sequence, it follows that, with probability one,

(4.4) $\lim_{N\to\infty} \frac{I(N)}{N} = \frac{1}{p}(p - L^*) = 1 - \frac{L^*}{p}$

because, with probability one,

$\lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^{N} (c_i - d_i) = p - L^*.$


Inspection is therefore at a minimum when L* is at a maximum compatible
with the specified AOQL. By (4.3) the latter means that

(4.5) $L^* \le L.$

SPA has been shown to guarantee this requirement. The optimum situation
from the point of view of the amount of inspection would therefore be to have
L* = L, but this cannot always be achieved. The absolute minimum amount
of inspection clearly is f, i.e., partial inspection exclusively. Consequently
from (4.4)

$1 - \frac{L^*}{p} \ge f,$

so that

(4.6) $L^* \le p(1 - f).$

Combining (4.5) and (4.6) we see that we have to consider three cases:

Case a. If

(4.7) $p > \frac{L}{1 - f}$

we have to show that

(4.8) $L = L^*.$

Case b. If

(4.9) $p < \frac{L}{1 - f}$

we have to show, by (4.4), that

$\lim_{N\to\infty} \frac{I(N)}{N} = f,$

that is,

(4.10) $L^* = p(1 - f).$

Case c. If

(4.11) $p = \frac{L}{1 - f}$

we have to show that

(4.12) $L = L^* = p(1 - f).$

Proof of (4.8): We have already remarked in Section 3 that in SPA partial
inspection always recurs, but complete inspection need never occur. We shall
show in a moment that (4.7) implies that no matter how large an integer γ
is chosen, the probability of temporarily stopping partial inspection for some
N > γ is one. Assume that this is so. Choose an arbitrarily small positive
ε, and let $\gamma > \left(\frac{1}{f} - 1\right)/\epsilon$. For a sequence where complete and partial inspection
alternate infinitely many times let

$A = \alpha_1, \alpha_2, \ldots,$ ad inf.

be the sequence of integers at which partial inspection ends, and let

$B = \beta_1, \beta_2, \ldots,$ ad inf.

be the sequence of integers at which complete inspection ends. Then, for all j,

$\alpha_{j+1} > \beta_j > \alpha_j.$

From the description of SPA it follows that, for all N > γ which belong to either
A or B,

(4.13) $|e_N - L| < \epsilon.$

In Section 3 we proved

(3.1) $\lim_{N\to\infty}\left(e_N - \frac{1}{N}\sum_{i=1}^{N} d_i\right) = 0$

with probability one. Since ε is arbitrarily small it follows that, with probability
To complete the proof of (4.8) we have still to show that L* exists and that the
probability is one that complete inspection will occur infinitely many times.
First we prove that L* exists.

As N increases during an interval of complete inspection, $D(N) = \sum_{i=1}^{N} d_i$
remains constant. Hence $\frac{D(N)}{N}$ decreases monotonically. Since for the ends
of such intervals (4.14) holds, it follows that (4.14) holds as N → ∞ and is a
member of A, B, or an interval $(\alpha_j, \beta_j)$ for all j.

Let N → ∞ while always being in the interior of an interval $(\beta_j, \alpha_{j+1}]$, j =
1, 2, ..., ad inf., which contains $\alpha_{j+1}$ but not $\beta_j$. Let N* be the total number
of units in these intervals through the Nth unit produced. Let $N_1$ and $N_2$ be
such that

$\beta_j = N_1 < N_2 \le \alpha_{j+1}.$

Then

$N_2^* - N_1^* = N_2 - N_1.$

Since the production process is in statistical control, we have, by the strong
law of large numbers,

(4.15) $\lim_{N\to\infty} \frac{D(N)}{N^*} = p(1 - f) = p'$

with probability one. Let δ* be the general designation for numbers < ε in
absolute value, so that all δ* are not the same. With probability one for almost
all N, we have by (4.15)

$\frac{D(N_1)}{N_1^*} = p' + \delta^*, \qquad \frac{D(N_2)}{N_2^*} = p' + \delta^*.$

Write

$\frac{D(N_2) - D(N_1)}{N_2 - N_1} = K.$

Now

$\frac{D(N_2)}{N_2^*} = \frac{D(N_1) + [D(N_2) - D(N_1)]}{N_1^* + (N_2^* - N_1^*)} = \frac{D(N_1) + [D(N_2) - D(N_1)]}{N_1^* + (N_2 - N_1)} = \frac{(p' + \delta^*)N_1^* + K(N_2 - N_1)}{N_1^* + (N_2 - N_1)} = p' + \delta^*.$

Hence

(4.16) $K(N_2 - N_1) = 2\delta^* N_1^* + (p' + \delta^*)(N_2 - N_1).$





Now suppose (4.3) does not hold. From the definition of AOQL it follows that
for some η > ε there exist sequences (whose totality has a positive probability)
so that, for infinitely many $N_2$ we have

(4.17) $\frac{D(N_2)}{N_2} = \frac{D(N_1) + K(N_2 - N_1)}{N_1 + (N_2 - N_1)} < L - 4\eta.$

For large enough $N_1$, from (4.14),

$\frac{D(N_1)}{N_1} = L + \delta^*$

with probability one and hence, using (4.16) in (4.17)

(4.18) $(L + \delta^*)N_1 + 2\delta^* N_1^* + (p' + \delta^*)(N_2 - N_1) < LN_1 + L(N_2 - N_1) - 4\eta N_2$

from which, using the fact that p' > L (from (4.7)), we get

(4.19) $N_1\delta^* + 2N_1^*\delta^* + \delta^*(N_2 - N_1) < -4\eta N_2.$

((4.18) and (4.19) hold for the sequences for which (4.17) holds, except perhaps
on a set of sequences whose probability is zero.) Since $N_1^* \le N_1$ and $|\delta^*| < \eta$,
we have, on the other hand,

(4.20) $N_1\delta^* + 2N_1^*\delta^* + \delta^*(N_2 - N_1) > -3\eta N_1 - \eta(N_2 - N_1) > -4\eta N_1 - 4\eta(N_2 - N_1) = -4\eta N_2$


which contradicts (4.19) and proves the desired result ((4.3) and (4.8)), except
that it remains to prove that, no matter how large γ, the probability of temporarily
stopping partial inspection at some N > γ is one. Let $\gamma_0 > \gamma$ be some
integer at which partial inspection is going on. From (4.2) and (4.7) it would
follow, if partial inspection never ceased on a set of sequences with positive
probability, that, on this set, with conditional probability one, for N sufficiently
large and ε sufficiently small,


ks ky^ ^ ±J 

m - 70 ) r=j 


+ 


- - {I- /)*, 


N - yo fN 


N 


N 


Civ > L+ 

This contradiction proves that complete inspection is eventually resumed and 
completes the proof of minimum inspection in Case a. 
Proof of (4.10): We shall prove that (4.9) implies that, with probability one,
complete inspection will cease, never to be resumed. For, from (4.15) and
(4.9) it follows that for N sufficiently large and ε sufficiently small,

(4.21) $\frac{D(N)}{N^*} = p' + \delta^* < L - 2\epsilon.$

Hence, a fortiori,

(4.22) $\frac{D(N)}{N} < L - 2\epsilon.$

((4.21) and (4.22) hold with probability one.)

(3.1) states that, with probability one,

$\lim_{N\to\infty}\left(e_N - \frac{D(N)}{N}\right) = 0.$

Hence for all N sufficiently large, with probability one,

$e_N < L - \epsilon,$

i.e., with probability one complete inspection is never resumed.

When (4.9) holds, therefore, with probability one and with a finite number of
exceptions SPA will require only partial inspection.

Proof of (4.12): If $p = \frac{L}{1 - f}$ and complete inspection finally never resumes,
then (4.12) follows easily. If $p = \frac{L}{1 - f}$ and partial and complete inspection
alternate infinitely many times, then the proof is similar to that of (4.8) and is
therefore omitted. In either case the desired result follows.


5. A class of SP all of which insure both a given AOQL and minimum inspection.
Let the definition of SPA be modified in the following particulars:

(b) Begin full inspection whenever

$e_N = \frac{k_N\left(\frac{1}{f} - 1\right)}{N} > L + \varphi(N).$

(c) Resume partial inspection when

$e_N \le L - \psi(N).$

Let φ(N) and ψ(N) be such that

$-\psi(N) \le \varphi(N),$

$\lim_{N\to\infty} \varphi(N) = \lim_{N\to\infty} \psi(N) = 0.$

(SPA corresponds to the case φ(N) ≡ ψ(N) ≡ 0.) Then all the SP of this class
have the property that the AOQL is L and that inspection is at a minimum in
the sense of Section 4. The proofs are essentially the same as those for SPA
and hence will be omitted.


6. The inspection plans of Section 5 can also be applied to lot inspection.
We shall carry on the discussion of this section in terms of SPA, but the results
apply to all the members of the class of plans described in Section 5. We shall
show that SPA can also be applied when the product is submitted for inspection
in lots. Although we assumed previously that the units of the product are
arranged in order of production, the results obtained for SPA remain valid for
any arbitrary arrangement of the units. If the product is submitted in lots we
may arrange the units as follows: Let $l_1, l_2, \ldots$, etc. be the successive lots in
the order of their submission for inspection. Within each lot we consider the
units arranged in the order in which they are chosen for inspection. In this way
we have arranged all units in an ordered sequence and the inspection can be
applied as described before. Thus, we start with partial inspection, i.e., we
take out groups of $\frac{1}{f}$ elements in $l_1$ and inspect one unit (selected at random)
from each of these groups. When $e_N > L$, we start complete inspection and
revert to partial inspection as soon as $e_N \le L$. When the units in $l_1$ are used
up in the process of inspection, we continue, using the units of $l_2$, etc.

If it is found inconvenient to take out a group of $\frac{1}{f}$ units and then to select
one unit for inspection, we could modify the sampling inspection plan as follows:
Instead of taking out a group of $\frac{1}{f}$ units and then selecting at random one unit
from it, we select at random one unit from the uninspected part of the lot and
look upon this unit as the unit selected at random from a hypothetical group of
$\frac{1}{f}$ units. Thus we can proceed exactly as before, except that we have to keep in
mind that with each unit inspected under “partial inspection” we have used
up another set of $\frac{1}{f} - 1$ units. Thus, as soon as $\frac{1}{f} - 1$ times the number of
units inspected under “partial inspection” becomes equal to or greater than the
number of units in the uninspected part of the lot, the inspection of that lot is
already terminated, and we have to start using the units of the next lot. The
inconvenience caused by the necessity of keeping track of the number of units
inspected under “partial inspection” and of the number of units in the uninspected
part of the lot can be eliminated by further modifying the inspection
plan as follows: Instead of beginning complete inspection as soon as $e_N > L$,
we continue “partial inspection” until $E_N = N(e_N - L)$ is so large that complete
inspection of all the units of the lot not yet used up has to be made in order to
bring $e_N$ down to L at the end of the lot. This leads to the following sampling
procedure, to be known as SPB: Let $N_0$ be the number of units in the lot, let
$N_1$ be the serial number of the last unit in the preceding lot, and let
$E(N_1) = N_1(e_{N_1} - L)$ be the “excess” carried over from the preceding lot.
For simplicity assume that the following are all integers:

$LN_0 = M, \qquad \frac{fM}{1-f} = M^*, \qquad fN_0 = N^*,$

and

$\frac{fE(N_l)}{1-f} = E^*.$


The inspection procedure is then as follows: Inspect successive units drawn at
random until either

(a) $M^* - E^*$ defectives have been found in the first $N' \le N^*$ units inspected.
In this case inspect further an additional $N_0 - \frac{N'}{f}$ units and this terminates the
inspection of the lot. The excess to be carried over to the next lot is then zero.
Or

(b) $N^*$ units have been inspected and the number of defectives found $H < M^* - E^*$.
In this case the inspection of the lot is terminated and the present
negative excess

$E(N_l + N_0) = \frac{1-f}{f}\,[H - (M^* - E^*)]$

is carried over to the next lot. (The serial number of the last element in the
present lot is $N_l + N_0$ and

$e_{N_l+N_0} = \frac{N_l e_{N_l} + \frac{1-f}{f} H}{N_l + N_0}.$

Hence the present excess is

$(N_l + N_0)[e_{N_l+N_0} - L] = N_l e_{N_l} + \frac{1-f}{f} H - LN_l - LN_0$

$= N_l(e_{N_l} - L) + \frac{1-f}{f} H - M$

$= \frac{1-f}{f}\,[H - M^* + E^*],$

as given above.)

We note an important property of SPB: The excess carried over from a
preceding lot is never positive.

7. Possible modifications of the SP to achieve local stability. Although
the sampling plans discussed in previous sections are optimum in the sense that
they guarantee the desired AOQL with a minimum of inspection when the
production process is in statistical control, they do not always behave very
favorably as far as local stability is concerned. To make this point clear,
consider the following example: Suppose that during a very long initial time
period the production process functions very well and the relative frequency
of defectives produced is well below L. Thus, applying SPA, say, $e_N - L$ will
be considerably less than zero at the end of this period. Now suppose that then
the production process suddenly deteriorates and the relative frequency of
defectives produced during the next period of time is considerably higher than L.
In spite of that, complete inspection will not begin for quite some time because
$e_N$ became so small during the initial period. Thus there will be a long segment
in the sequence of outgoing units within which the relative frequency of
defectives will be larger than the prescribed AOQL. Of course, this segment will
be counterbalanced by other segments where the relative frequency of defectives
will be below the AOQL, so that the AOQL will not be violated. Nevertheless, the
occurrence of long segments with too many defectives, i.e., a lack of local
stability, is not desirable.

It should be noted that, even though SPA was not designed to achieve
considerable local stability, drastic lack of local stability cannot occur when the
production process is in statistical control and SPA is employed. In the example
given above where the outgoing quality was not locally stable, it was assumed
that there were variations in the production process. The existence of statistical
control acts as an important stabilizing factor on the quality.

In this section we want to discuss several possible modifications of SPA which
will insure a greater degree of local stability. One such modification is the
following: We choose a positive constant A and we define the excess E* for each
value N as follows: E*(N) is equal to the excess E(N) as originally defined
(= $N[e_N - L]$) as long as for all N' ≤ N, E(N') ≥ −A. The difference
E*(N + 1) − E*(N) = E(N + 1) − E(N) for all N for
which E(N + 1) − E(N) ≥ 0. If E(N + 1) − E(N) < 0, then E*(N + 1) =
max[E*(N) + {E(N + 1) − E(N)}, −A]. In other words, with this modification
of the sampling inspection plan we set a lower bound −A for the excess.
When the excess is positive we begin complete inspection, and revert to partial
inspection when the excess becomes non-positive. The effect of this is that, if
the proportion of defectives produced becomes large, complete inspection will
not be delayed very long, although the proportion of defectives produced in the
preceding period may have been considerably below L. It is clear that this
modification of SPA does not increase the AOQL. However, the amount of
inspection will be somewhat increased, especially when the quality of the product
is less than or only slightly greater than L. If the constant A is large, the
increase in the amount of inspection is only slight, but also the degree of local
stability achieved is not very high. On the other hand, if A is small, the increase
in the amount of inspection may be considerable, but a high degree of local
stability is achieved. Thus, the choice of A should be made so that a proper
balance between local stability and amount of inspection is achieved.
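
The lower bound on the excess can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `bounded_excess`, the starting value E*(0) = 0, and the representation of the plan by its successive increments E(N + 1) − E(N) are choices made here for the example and are not part of the plan as defined above.

```python
def bounded_excess(increments, floor_a):
    """Return the path of the modified excess E*(N).

    increments: successive values of E(N+1) - E(N) (hypothetical input format).
    floor_a:    the positive constant A; the excess is never allowed below -A.
    Assumes E*(0) = 0.
    """
    e_star = 0.0
    path = []
    for delta in increments:
        if delta >= 0:
            # Non-negative increments are carried over unchanged.
            e_star += delta
        else:
            # Negative increments are applied, but the result is clipped at -A.
            e_star = max(e_star + delta, -floor_a)
        path.append(e_star)
    return path


# Three decreases followed by one increase, with A = 2.
print(bounded_excess([-1, -1, -1, 2.5], floor_a=2))
```

Without the bound the excess after these four steps would still be negative (−0.5); with the bound it is already positive, so complete inspection would begin sooner.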

Modifying SPA by setting a lower limit for the excess has the disadvantage
that the mathematical treatment of this case is involved. We shall, therefore,
consider another modification of the inspection plan which will have largely the
same effect, but whose mathematical treatment appears to be much simpler. A
fixed positive integer $N_0$ is chosen and the inspection scheme is designed so that
$E_{N_0} \le 0$ is assured. If $E_{N_0}$ is negative, we replace it by zero. In other words,
no excess is carried over from the first segment of $N_0$ units to the next segment of
$N_0$ units. Thus, the second segment of $N_0$ units is treated exactly the same way
as if it were the first segment, and this is repeated for each consecutive segment
of $N_0$ units. This modification of SPA (the resulting plan is to be known as
SPC) has essentially the same effect as setting a lower bound for the excess.
Again it is clear that by this modification the AOQL is not increased, but the
amount of inspection may be increased. The latter is particularly true when
$N_0$ is small, which corresponds to very high local stability requirements. More
efficient plans than SPC can probably be devised for this situation.

Undoubtedly, there are many other possible modifications of the inspection
plan by which a greater degree of local stability can be achieved at the price of
somewhat increased inspection. It is not the purpose of this paper to enumerate
all these possibilities or to develop a theory as to which of them may be
considered an optimum procedure. We shall restrict ourselves to a discussion of the
mathematical consequences of SPC. First we define it precisely. If it is to be
applied to inspection of lots of size $N_0$ then SPC is simply SPB with $E(N_l)$
and E* always zero. When applied to continuous production it will operate
as follows: Assume for convenience that $M = LN_0$, $N^* = fN_0$, and
$M^* = \frac{fM}{1-f}$ are all integers.

(a) Begin each segment of $N_0$ units with partial inspection, i.e., inspect one
unit chosen at random from each successive group of j units. Continue partial
inspection until one of the following events occurs: either

(b) M* defectives are found. In this case begin complete inspection with the
first unit which follows the group in which the last of the M* defectives was
found and continue until the end of the segment of $N_0$ units,

or

(b') N* groups of j units are partially inspected.

(c) Repeat with the next segment of $N_0$ units.
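
Steps (a) to (c) can be sketched as a small simulation of one segment. The sketch below is an illustration only and rests on assumptions not stated in the text: the group size is taken as j = 1/f, $N_0$ is a multiple of j, the segment is supplied as a list of booleans (True for a defective unit), and the function name `spc_segment` and its return values are hypothetical.

```python
import random


def spc_segment(units, f, L, rng):
    """Apply one SPC segment; return (units inspected, defectives passed).

    units: list of bool, True = defective; len(units) is N_0 (assumed input).
    f:     fraction inspected under partial inspection (group size j = 1/f).
    L:     the prescribed AOQL.
    rng:   a random.Random instance used to pick one unit per group.
    """
    n0 = len(units)
    j = round(1 / f)                        # assumed group size
    n_star = round(f * n0)                  # N* = f N_0
    m_star = round(f * L * n0 / (1 - f))    # M* = f M / (1 - f), M = L N_0

    found = 0       # defectives found during partial inspection
    inspected = 0   # units inspected so far
    passed = 0      # defectives that left uninspected
    pos = 0         # index of the first unit not yet used up

    for _ in range(n_star):
        group = units[pos:pos + j]
        pick = rng.randrange(j)             # unit chosen at random from the group
        inspected += 1
        if group[pick]:
            found += 1
        # Defectives among the uninspected units of the group are passed.
        passed += sum(group) - group[pick]
        pos += j
        if found == m_star:                 # event (b): switch to complete inspection
            break

    # Complete inspection of the rest of the segment (empty if (b') occurred).
    inspected += n0 - pos
    return inspected, passed


rng = random.Random(0)
print(spc_segment([False] * 400, f=0.1, L=0.045, rng=rng))
print(spc_segment([True] * 400, f=0.1, L=0.045, rng=rng))
```

In the second call every unit is defective, so M* = 2 defectives are found in the first two groups and exactly $LN_0$ = 18 defectives pass, which illustrates why the AOQL is not exceeded within a segment.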

Comparison with SPB shows that, in SPC, if (b) occurs earlier or at the same
time as (b'), then $E_{N_0} = 0$, while if (b') occurs before (b) we have $E_{N_0} < 0$.
In contradistinction to SPB, in SPC there is no carrying over of the excess.
Let us determine the AOQ for SPC when the production process is in a state
of statistical control. Denote by p the probability that a unit produced will be
defective. Let the chance variable H denote the number of defectives found
during partial inspection. The probability that H = i < M* is

$\binom{N^*}{i} p^i (1-p)^{N^*-i}.$

H ≤ M* always. We have, when H = i,

$E_{N_0} = \frac{1-f}{f}\,(i - M^*)$

and hence

$N_0 e_{N_0} = \frac{(1-f)\,i}{f}.$

The AOQ is therefore $\frac{1-f}{fN_0}$ multiplied by the expected value of H and is
therefore

(7.1) $\mathrm{AOQ} = L - \frac{1-f}{fN_0} \sum_{i=0}^{M^*-1} (M^*-i) \binom{N^*}{i} p^i (1-p)^{N^*-i}.$

The reduction from the original quality p to the AOQ was achieved by inspecting
a fraction of units which is $\frac{1}{p}$ times the reduction in the frequency of defectives.
Hence, with probability one, the fraction of units inspected when the production
process is in statistical control is

(7.2) $I = 1 - \frac{L}{p} + \frac{1-f}{fN_0\,p} \sum_{i=0}^{M^*-1} (M^*-i) \binom{N^*}{i} p^i (1-p)^{N^*-i}.$

When $p > \frac{L}{1-f}$ we see from Section 4 that the third term of the right member
of (7.2) represents the price paid in fraction of inspection above the minimum in
return for the local stability achieved. When $p < \frac{L}{1-f}$ the additional
inspection is of course I − f.
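
The two statements above (the AOQ is (1 − f)/(f$N_0$) times the expected value of H, and the fraction inspected is 1/p times the reduction in the frequency of defectives) can be evaluated numerically. The following is a minimal sketch, assuming H is binomial on N* trials but capped at M*; the function name `spc_aoq_and_fraction` and the parameter values in the usage line are illustrative choices, not taken from the text.

```python
from math import comb


def spc_aoq_and_fraction(p, L, f, n0):
    """Return (AOQ, fraction inspected) for SPC under statistical control.

    Assumes M = L*n0, N* = f*n0 and M* = f*M/(1-f) are integers
    (they are rounded here to absorb floating-point error).
    """
    n_star = round(f * n0)
    m_star = round(f * L * n0 / (1 - f))
    q = 1 - p

    # P(H = i) for i < M*; the remaining probability mass sits at H = M*.
    pmf = [comb(n_star, i) * p**i * q**(n_star - i) for i in range(m_star)]
    expected_h = sum(i * w for i, w in enumerate(pmf)) + m_star * (1 - sum(pmf))

    aoq = (1 - f) / (f * n0) * expected_h
    fraction = (p - aoq) / p   # 1/p times the reduction in defectives
    return aoq, fraction


aoq, fraction = spc_aoq_and_fraction(p=0.01, L=0.045, f=0.1, n0=400)
print(round(aoq, 4), round(fraction, 2))
```

Computing E(H) directly avoids committing to any particular closed form; since E(H) never exceeds M*, the AOQ returned never exceeds L.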

As $N_0$ becomes larger, SPC becomes more and more like SPA, and
consequently the amount of inspection tends to the minimum. As $N_0$ becomes
smaller, the degree of local stability achieved becomes higher and must be
paid for by an increasing amount of inspection. An illustrative example will be
given in the next section. It has already been pointed out that the mere
existence of statistical control implies a considerable amount of local stability even
when SPA is applied.

The only practical difficulty which may arise in evaluating the formulas in
(7.1) and (7.2) might come from attempting to evaluate

$T' = \sum_{i=0}^{M^*-1} (M^*-i) \binom{N^*}{i} p^i (1-p)^{N^*-i}.$

For those values of the parameters which are likely to occur in application, a
good approximation to T' (exactly how good we shall not investigate here) is
given by

$T = \sum_{i=0}^{M^*-1} (M^*-i)\, e^{-N^*p}\, \frac{(N^*p)^i}{i!}.$


A table of T for integral values of M* from 2 to 16 and for integral values of N*p 
from 1 to 25 is given below. The computations were performed under the 
direction of Mr. Mortimer Spiegelman of the Metropolitan Life Insurance 
Company, to whom the authors are deeply obliged. 
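
For readers who prefer to compute T rather than interpolate in the table, the following sketch evaluates it directly. It assumes that the approximation intended is the Poisson form T = Σ (M* − i) e^{−λ} λ^i / i!, summed over i = 0, ..., M* − 1 with λ = N*p; the function name `poisson_t` is hypothetical.

```python
from math import exp, factorial


def poisson_t(m_star, lam):
    """Poisson approximation T for given M* and lam = N* p (assumed form)."""
    return sum(
        (m_star - i) * exp(-lam) * lam**i / factorial(i)
        for i in range(m_star)
    )


# M* = 2 with N*p = 1 and N*p = 2, and M* = 3 with N*p = 1.
print(round(poisson_t(2, 1), 2), round(poisson_t(2, 2), 2), round(poisson_t(3, 1), 2))
```

Under this reading the values 1.10, .54 and 2.02 that open the table correspond to M* = 2 (N*p = 1 and 2) and M* = 3 (N*p = 1).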


Table of $T = \sum_{i=0}^{M^*-1} (M^*-i)\, e^{-N^*p} (N^*p)^i / i!$

                                          N*p
M*-1      1      2      3      4      5      6      7      8      9     10     11     12
  1    1.10    .54    .25    .11    .05    .02    .01    .00    .00    .00    .00    .00
  2    2.02   1.22    .67    .35    .17    .08    .04    .02    .01    .00    .00    .00
  3    3.00   2.08   1.32    .78    .44    .23    .12    .06    .03    .01    .01    .00
  4    4.00   3.02   2.13   1.41    .88    .52    .29    .16    .08    .04    .02    .01
  5    5.00   4.01   3.05   2.20   1.49    .96    .59    .35    .20    .11    .06    .03
  6    6.00   5.00   4.02   3.08   2.26   1.57   1.04    .66    .41    .24    .14    .08
  7    7.00   6.00   5.01   4.03   3.12   2.31   1.64   1.12    .73    .46    .28    .17
  8    8.00   7.00   6.00   5.01   4.05   3.16   2.37   1.71   1.19    .79    .51    .32
  9    9.00   8.00   7.00   6.00   5.02   4.08   3.20   2.43   1.77   1.25    .85    .56
 10   10.00   9.00   8.00   7.00   6.01   5.03   4.10   3.24   2.48   1.83   1.31    .91
 11   11.00  10.00   9.00   8.00   7.00   6.01   5.05   4.13   3.28   2.53   1.89   1.37
 12   12.00  11.00  10.00   9.00   8.00   7.01   6.02   5.07   4.16   3.32   2.58   1.95
 13   13.00  12.00  11.00  10.00   9.00   8.00   7.01   6.03   5.08   4.19   3.36   2.63
 14   14.00  13.00  12.00  11.00  10.00   9.00   8.00   7.01   6.04   5.10   4.22   3.40
 15   15.00  14.00  13.00  12.00  11.00  10.00   9.00   8.01   7.02   6.05   5.12   4.25


Table of T (Continued)

                                             N*p
M*-1     13     14     15     16     17     18     19     20     21     22     23     24     25
  1     .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00
  2     .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00
  3     .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00
  4     .01    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00
  5     .02    .01    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00
  6     .04    .02    .01    .01    .00    .00    .00    .00    .00    .00    .00    .00    .00
  7     .10    .05    .03    .02    .01    .00    .00    .00    .00    .00    .00    .00    .00
  8     .20    .12    .07    .04    .02    .01    .01    .00    .00    .00    .00    .00    .00
  9     .36    .23    .14    .08    .05    .03    .01    .01    .00    .00    .00    .00    .00
 10     .61    .40    .26    .16    .10    .06    .03    .02    .01    .01    .00    .00    .00
 11     .97    .66    .44    .29    .18    .11    .07    .04    .02    .01    .01    .00    .00
 12    1.43   1.02    .71    .48    .32    .20    .13    .08    .05    .03    .02    .01    .01
 13    2.00   1.48   1.07    .75    .52    .35    .23    .15    .09    .06    .03    .02    .01
 14    2.68   2.05   1.54   1.12    .80    .55    .38    .25    .16    .10    .07    .04    .02
 15    3.44   2.72   2.10   1.59   1.17    .84    .59    .41    .27    .18    .12    .07    .05


8. The SP of H. F. Dodge. H. F. Dodge [1] has proposed a very interesting
SP for continuous production. The plan is defined by two constants i and f
and may be described as follows: Begin with complete inspection of the units
consecutively as produced and continue such inspection until i units in
succession are found non-defective. Thereafter inspect a fraction f of the units.
Continue partial inspection until a defect is found. Then start complete
inspection again and continue until i units in succession are found non-defective.
Repeat the procedure.

Dodge [1] derived formulas for determining the AOQL corresponding to any
pair i and f, under the assumption that the production process is in a state of
statistical control. Dodge's formulas for the AOQL are not necessarily valid
if we do not make this restriction on the production process, i.e., if we admit
that the probability p that a unit will be defective may vary in any arbitrary
way during the production process. This, of course, is not a criticism of the
derivation of the formulas; it cannot be considered surprising that a formula is
not valid under assumptions different from those under which it was derived.
However, it is relevant to point out the fact that the Dodge SP does not
guarantee the AOQL under all circumstances, so that care must be taken to ensure that
certain requirements are met. Exactly what these requirements are is not
known; statistical control is a sufficient condition, but is probably not necessary
and could be weakened. It seems likely to the authors that, if p varies only
slowly (with N) with infrequent “jumps,” the Dodge SP will produce results
which will exceed the AOQL by little, if at all. But if the “jumps” are numerous
and appropriately spaced it is possible to exceed the AOQL by substantial
amounts, as the example below will show. The Dodge plan was intended to
serve as an aid to the detection and correction of malfunctioning of the
production process and this use would tend to prevent the occurrence of such a
phenomenon. Parenthetically, it should be remarked that the information obtained in
the course of inspection according to either the plans discussed in this paper or
any reasonable scheme should, if possible, be sent at once to the producing
divisions for their guidance.

An example to show that the AOQL can be exceeded can be constructed as
follows: Let i = 54 and f = 0.1. Then according to the graphs of [1], page
272, the AOQL should be 0.02. Define a sequence of 60 successive units free
of defectives as a segment of type 1, and a sequence of 60 successive units where
the production process is in statistical control with p = 0.1, as a segment of type
2. Suppose that the sequence of units produced consists of segments of types
1 and 2 always alternating. Then it follows that the first item inspected in a
segment of type 2 is always inspected on a partial inspection basis. We now
assume that, unless the occurrence of a defective has previously terminated
partial inspection, the 1st, 11th, 21st, 31st, 41st, and 51st items in a segment
of type 2 will be chosen for partial inspection, and if the 1st item is found
defective, the entire segment of type 2 will be cleared of defectives. (Both of these
assumptions favor the Dodge SP.) Then the situation is as described in the
following table:


          (1)                        (2)                              (3)

Item      Probability of first       Expected number of defectives    (1) × (2)
          terminating partial        remaining in segment of type 2
          inspection at each item    after partial inspection has
                                     been terminated

1st       .1                         0                                0
11th      (.9)(.1) = .09             .9                               .081
21st      (.9)^2(.1) = .081          1.8                              .1458
31st      (.9)^3(.1) = .0729         2.7                              .19683
41st      (.9)^4(.1) = .06561        3.6                              .236196
51st      (.9)^5(.1) = .059049       4.5                              .2657205

          Probability that an        Expected number of defectives    Product
          entire segment of type 2   left in a segment of type 2
          will be partially          which has been inspected only
          inspected                  partially

          (.9)^6 = .531441           5.4                              2.8697814

                                                                Sum = 3.7953279

The AOQ is therefore $\frac{3.7953279}{120} = .0316+$, while L = .02.
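
The arithmetic of this example is easy to check mechanically. The sketch below recomputes the expected number of defectives left in a segment of type 2 and divides by the 120 units of one type 1 segment plus one type 2 segment. The function name `dodge_example_aoq` and its parameters are hypothetical; the logic follows the two simplifying assumptions stated in the example (fixed inspection positions 1, 11, ..., 51, and complete clearing of the rest of the segment once a defective is found).

```python
def dodge_example_aoq(p=0.1, seg_len=60, gap=10):
    """Expected outgoing fraction defective in the alternating-segment example."""
    positions = range(1, seg_len, gap)          # items 1, 11, 21, 31, 41, 51
    expected_left = 0.0
    for k, pos in enumerate(positions):
        prob_stop = (1 - p) ** k * p            # first defective found at this item
        # Items before `pos` that were not inspected, each defective w.p. p.
        remaining = (pos - 1 - k) * p
        expected_left += prob_stop * remaining
    # No inspected item was defective: all uninspected items may be defective.
    n_checks = len(positions)
    expected_left += (1 - p) ** n_checks * (seg_len - n_checks) * p
    # One type 1 segment (no defectives) plus one type 2 segment.
    return expected_left / (2 * seg_len)


print(round(dodge_example_aoq(), 4))
```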


It is therefore difficult to compare the Dodge plan with any of the plans
described in this paper with respect to their effect on a production process not in
statistical control. If the production process is in statistical control, then, as we
have already seen, SPA requires minimum inspection (and, incidentally, because
of the existence of statistical control, produces a fair degree of local stability).
If, when statistical control exists, one requires both maintenance of a given
AOQL and a higher degree of local stability than is produced by SPA, the
relevant comparison is between the Dodge plan and SPC. Both will probably give
good results as regards local stability, but it is not possible at present to make
these intuitive notions precise, as we have not given an exact definition of local
stability. The following example (in which statistical control is assumed) may
not be unrepresentative of what the situation is with regard to the amount of
inspection required.


Fraction of product inspected under the Dodge plan and under SPC when
L = .045, f = .1

         Fraction of product      Fraction of product inspected under SPC when
  p      inspected under the
         Dodge plan               N_0 = 400     N_0 = 1000     N_0 = 2000

 .01          .12                    .12            .10            .10
 .02          .15                    .17            .11            .10
 .03          .19                    .22            .14            .11
 .04          .23                    .28            .19            .16
 .05          .28                    .34            .26            .21
 .06          .33                    .40            .33            .29
 .07          .39                    .45            .39            .37
 .08          .45                    .50            .46            .44
 .09          .52                    .54            .51            .50
 .10          .58                    .57            .55            .55


The decrease in inspection required by SPC as No increases is evident in this 
table. When No = 2000 SPC requires less inspection than the Dodge plan, 
when No = 400 it requires more inspection than the Dodge plan. How the 
various degrees of local stability achieved compare remains an open question. 
The case when No = 400 probably lies in the region where SPC is inefficient 
(as regards amount of inspection) and corresponds to a high degree of local 
stability. 

We note that both plans call for increased inspection as the quality worsens
(p increases). If the manufacturer is required to pay for the inspection this
serves as an added incentive to improve quality of output.


THE EXPECTED VALUE AND VARIANCE OF THE RECIPROCAL AND
OTHER NEGATIVE POWERS OF A POSITIVE BERNOULLIAN
VARIATE¹

By Frederick F. Stephan
War Production Board, Washington

1. Introduction. The expected value of the reciprocal of a Bernoullian
variate appears in certain problems of random sampling wherein both practical
considerations and mathematical necessity make zero an inadmissible value
of the variate. This special condition excluding zero is necessary from a practical
standpoint because statistics cannot be calculated from an empty class. It is a
necessary condition, in the mathematical sense, for the expected value, and
variances involving it, to be finite. When subject to this condition the
Bernoullian variate will be designated the positive Bernoullian variate.

There appears to be no simple expression for the expected value of the
reciprocal such as there is for the expected value of positive integral powers of the
positive Bernoullian variate. This paper presents in (16) a factorial series,
which can be computed conveniently to any desired number of terms by means
of the recursion relation (18). Upper and lower bounds on the remainder may
be computed readily from (20), (21), (23), (24), and (26) and the approximation
may be improved by adding an estimate of the remainder taken between these
bounds. A factorial series for the expected value of negative integral powers
is given in (34). A factorial series for the expected value of the reciprocal of the
positive hypergeometric variate is given in (53). Series for the variances follow
directly from the series for expected values.

A simple example of the sampling problems in which this expected value
appears is presented by the following instance of estimates derived from samples
of variable size:

An infinite population consists of items of two kinds or classes, A and B.
Lots of N items each are drawn at random. In such lots the number of items,
x', that are of class A is an ordinary Bernoullian variate. Next, every lot
composed entirely of items of class B is discarded. This excludes all lots for
which x' = 0. From each remaining lot the N − x' items of class B are set
aside, leaving a sample composed entirely of items of class A. The number of
such items, x, varies from sample to sample. It will be designated a positive
Bernoullian variate since x = x' if x' > 0 and x does not exist if x' = 0. Finally,
let there be associated with each item in class A a particular value of a variable,
y, the variance of which in A is $\sigma^2$. Then if the mean value of y is computed for
each sample, the error variance of such means is $E(\sigma^2/x) = \sigma^2 E(1/x)$.

Instances similar to that just described occur in the design of sampling surveys
from which statistics are to be obtained separately for each of several classes
of the population, i.e., each statistic is to be computed from some part of the
sample instead of all of it. They also occur in certain sampling problems in
which some of the items drawn for a sample turn out to be blanks.

¹ Developed from a section of a paper presented to the Washington meeting of
the Institute of Mathematical Statistics on June 18, 1943.

A related problem concerning the error variance of the proportion of males
among infants born in any one year was considered by G. Bohlmann in a paper
on approximations to the expected value and standard error of a function [1].
His approach to the problem was to expand the function in a Taylor series and
take the expected value of each term. The conditions under which the resulting
series converges were developed for certain functions of a Bernoullian variate.
The present paper provides a different and, in certain respects, superior approach
to the problem employing a method due to Stirling [2]. While the method is
applied to the reciprocal and negative powers it is also applicable to certain
other functions of a Bernoullian variate.


2. The positive Bernoullian variate. Let x be a random variate defined by a
Bernoullian probability function subject to the special condition x > 0. The
probability of x in n is

(1) $P(x) = \binom{n}{x} p^x q^{n-x} / (1 - q^n)$

where x and n are integers, 1 ≤ x ≤ n, and

(2) $\binom{n}{x} = \frac{n!}{x!\,(n-x)!}.$

The probabilities p and q are constants, 0 < p = 1 − q < 1.

The divisor $1 - q^n$ arises from the condition excluding zero. (Bohlmann
omits this factor, assuming that $q^n$ is negligible, an assumption that is not
always valid. In fact, $q^n$ …) An extension of this condition to exclude
all values of x less than a specified constant will be considered in a later section.

Throughout this paper summation is understood to be from x = 1 to x = n
unless it is shown otherwise.


3. Expected values and moments. The expected values of x and its positive
integral powers are

(3) $E(x) = np/(1 - q^n)$

(4) $E(x^2) = (npq + n^2p^2)/(1 - q^n)$

and, in general

(5) $E(x^i) = \nu_i/(1 - q^n) = \frac{1}{1-q^n} \sum_{j=1}^{i} \mathfrak{S}_i^j\, j! \binom{n}{j} p^j, \quad i > 0,$

where $\nu_i$ is the ith moment about zero of an ordinary Bernoullian variate with
the same n and p and the $\mathfrak{S}_i^j$ are the Stirling numbers of the second kind (see
Table 1).

The moments about E(x) are somewhat more complicated than the
corresponding moments of the ordinary Bernoullian variate. For example, the
variance

(6) $E\{(x - E(x))^2\} = \frac{npq}{1-q^n} - \frac{n^2p^2q^n}{(1-q^n)^2}$

and the third moment

(7) $E\{(x - E(x))^3\} = \frac{(q-p)\,npq}{1-q^n} - \frac{3n^2p^2q^{n+1}}{(1-q^n)^2} + \frac{n^3p^3q^n(1+q^n)}{(1-q^n)^3}.$

The moments about np, the first moment of an ordinary Bernoullian variate,
are

(8) $E\{(x - np)^i\} = (\mu_i + (-1)^{i+1}(np)^i q^n)/(1 - q^n)$
TABLE 1

Stirling numbers of the second kind, $\mathfrak{S}_i^j$

  i \ j       1       2       3       4       5       6
    1         1       0       0       0       0       0
    2         1       1       0       0       0       0
    3         1       3       1       0       0       0
    4         1       7       6       1       0       0
    5         1      15      25      10       1       0
    6         1      31      90      65      15       1
    7         1      63     301     350     140      21
    8         1     127     966   1,701   1,050     266
    9         1     255   3,025   7,770   6,951   2,646
   10         1     511   9,330  34,105  42,525  22,827

where $\mu_i$ is the ith moment, about the mean, of an ordinary Bernoullian variate
with the same values n and p.


The expected value of the reciprocal is

(9) $E(1/x) = \frac{1}{1-q^n} \left\{ npq^{n-1} + \frac{n(n-1)}{2 \cdot 2!}\, p^2 q^{n-2} + \cdots + \frac{1}{x}\binom{n}{x} p^x q^{n-x} + \cdots + \frac{1}{n}\, p^n \right\}.$

This equation is not suitable for the computation of E(1/x) to a satisfactory
degree of approximation unless np is small, say less than 5 for most purposes.
The number of terms necessary to obtain a computed value with four significant
figures, for example, may be estimated to be approximately $8\sqrt{npq}/(1 - q^n)$.
Expressed as a function of q, E(1/x) becomes

(10) $E(1/x) = \frac{1}{1-q^n} \sum \frac{q^{x-1} - q^n}{n - x + 1},$

a series which may be convenient for small values of q.

E(1/x) may be expanded in a power series by Taylor's Theorem. It may
also be expanded in a finite series of expected values of powers, either in E(x),
$E(x^2)$, ... or in E(x − c), $E(x - c)^2$, ..., c being any positive constant. The
second of these three series may be obtained by expanding … and taking
expected values, and the third by dividing out … and taking
expected values. For all three expansions, however, the terms become
progressively more complicated and laborious to compute. A simpler and more
convenient series for actual computations may be obtained by expanding 1/x in a
factorial series.

4. Expansion of E(1/x) in a series of inverse factorials. It is easy to prove
by induction that, x > 0,

(11) $\frac{1}{x} = \frac{0!}{x+1} + \frac{1!}{(x+1)(x+2)} + \cdots + \frac{(i-1)!\,x!}{(x+i)!} + \cdots + \frac{(t-1)!\,x!}{(x+t)!} + R_t(x)$

where

(12) $R_t(x) = t!\,(x-1)!/(x+t)!$

is the remainder after the first t terms. This is, of course, an expansion in
Beta functions. It is also a simple special case of the expansion of a function
in a “faculty series” or series of inverse factorials [3] with an exact expression
for the remainder.
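
The identity can be checked exactly with rational arithmetic. The sketch below assumes the general term of the expansion is (i − 1)! x!/(x + i)! and the remainder is t!(x − 1)!/(x + t)!; the function name `inverse_factorial_expansion` is hypothetical.

```python
from fractions import Fraction
from math import factorial


def inverse_factorial_expansion(x, t):
    """Return (list of the first t terms, remainder R_t(x)) as exact fractions."""
    terms = [
        Fraction(factorial(i - 1) * factorial(x), factorial(x + i))
        for i in range(1, t + 1)
    ]
    remainder = Fraction(factorial(t) * factorial(x - 1), factorial(x + t))
    return terms, remainder


terms, remainder = inverse_factorial_expansion(x=3, t=4)
print(sum(terms) + remainder)   # equals 1/3 exactly
```

Because the remainder is known in closed form, the truncation error of the series is exact rather than estimated, which is what makes the bounds on E(R_t(x)) possible.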

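Let

(13) $s_i = \sum \binom{n+i}{x+i} p^{x+i} q^{n-x} / (1 - q^n).$

Then, since

(14) $\frac{x!}{(x+i)!} \binom{n}{x} p^x q^{n-x} = \frac{n!}{(n+i)!\,p^i} \binom{n+i}{x+i} p^{x+i} q^{n-x},$

the expected value of (11) is

(15) $E(1/x) = \frac{s_1}{(n+1)p} + \frac{1!\,s_2}{(n+1)(n+2)p^2} + \cdots + \frac{(i-1)!\,n!\,s_i}{(n+i)!\,p^i} + \cdots + \frac{(t-1)!\,n!\,s_t}{(n+t)!\,p^t} + E(R_t(x)).$

When developed as infinite series, both (11) and (15) are convergent since the
remainders $R_t(x) \to 0$ as $t \to \infty$.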

For computing purposes it is convenient to write

(16) $E(1/x) = \sum_{i=1}^{t} u_i + E(R_t(x))$

in which, since $s_0 = 1$ and

(17) $s_i = s_{i-1} - \binom{n+i-1}{i} \frac{p^i q^n}{1 - q^n},$

the following recursion relation exists between $u_i$ and $u_{i-1}$:

(18) $u_i = \frac{(i-1)!\,n!\,s_i}{(n+i)!\,p^i} = \frac{(i-1)\,u_{i-1} - k/i}{(n+i)p}, \quad i > 1; \qquad u_1 = \frac{1-k}{(n+1)p},$

where

(19) $k = npq^n/(1 - q^n) < np/(e^{np} - 1).$

This reduces the computing of the $u_i$ to a simple repetitive procedure. The
computing is still simpler in those problems in which, for the degree of precision
desired, k is negligible.
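
The repetitive procedure can be sketched as follows. The sketch assumes the recursion in the form u_1 = (1 − k)/((n + 1)p) and u_i = ((i − 1)u_{i−1} − k/i)/((n + i)p) for i > 1, with k = npq^n/(1 − q^n); the function names are hypothetical, and the direct binomial sum is included only as a check.

```python
from math import comb


def exact_mean_reciprocal(n, p):
    """E(1/x) for the positive Bernoullian variate, by direct summation."""
    q = 1 - p
    total = sum(comb(n, x) * p**x * q**(n - x) / x for x in range(1, n + 1))
    return total / (1 - q**n)


def factorial_series_sums(n, p, t):
    """Partial sums u_1, u_1 + u_2, ..., of the factorial series (no remainder)."""
    q = 1 - p
    k = n * p * q**n / (1 - q**n)
    u = (1 - k) / ((n + 1) * p)          # first term u_1
    sums = [u]
    for i in range(2, t + 1):
        u = ((i - 1) * u - k / i) / ((n + i) * p)
        sums.append(sums[-1] + u)
    return sums


sums = factorial_series_sums(n=100, p=0.1, t=5)
print([round(s, 6) for s in sums], round(exact_mean_reciprocal(100, 0.1), 6))
```

For n = 100 and p = 0.1 the partial sums rise toward the exact value from below, since every omitted remainder is positive.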

An estimate of $E(R_t(x))$ should be added to the sum in (16) to improve the
approximation. To determine a suitable estimate, a lower bound for the
expected value of the remainders may be computed from one of the following
inequalities:

(20) $E(R_t(x)) = \sum \frac{t}{x}\, \frac{(t-1)!\,x!}{(x+t)!}\, P(x) \ge \sum \frac{t}{m} \left( 2 - \frac{x}{m} \right) \frac{(t-1)!\,x!}{(x+t)!}\, P(x) = \frac{2t}{m}\, u_t - \frac{t}{m^2} \{ (t-1)u_{t-1} - tu_t \}, \quad m \ne 0,$

which is maximized by setting $m = \{(t-1)u_{t-1} - tu_t\}/u_t$, whence

(21) $E(R_t(x)) > tu_t^2 / \{(t-1)u_{t-1} - tu_t\}, \quad t > 1.$

Also, since when m = E(x)

(22) $\sum (x - m)\, \frac{(t-1)!\,x!}{(x+t)!}\, P(x) < \sum (x - m) P(x) = 0,$

a simpler inequality is

(23) $E(R_t(x)) > tu_t(1 - q^n)/np.$
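
The two lower bounds just given are simple functions of the terms $u_t$. The sketch below assumes the forms E(R_t) > t u_t² / {(t − 1)u_{t−1} − t u_t} for t > 1 and E(R_t) > t u_t (1 − q^n)/(np), and recomputes the terms with the recursion so that the block is self-contained; all function names are hypothetical.

```python
def factorial_terms(n, p, t):
    """Terms u_1, ..., u_t of the factorial series (assumed recursion)."""
    q = 1 - p
    k = n * p * q**n / (1 - q**n)
    u = [(1 - k) / ((n + 1) * p)]
    for i in range(2, t + 1):
        u.append(((i - 1) * u[-1] - k / i) / ((n + i) * p))
    return u


def lower_bound_maximized(u, t):
    """Bound obtained by maximizing over m; valid for t > 1 (u[t-1] is u_t)."""
    denom = (t - 1) * u[t - 2] - t * u[t - 1]
    return t * u[t - 1] ** 2 / denom


def lower_bound_simple(u, t, n, p):
    """Simpler bound obtained by setting m = E(x)."""
    q = 1 - p
    return t * u[t - 1] * (1 - q**n) / (n * p)


u = factorial_terms(n=100, p=0.1, t=5)
partial = sum(u[:2])
print(round(partial + lower_bound_maximized(u, 2), 6))
print(round(partial + lower_bound_simple(u, 2, 100, 0.1), 6))
```

For n = 100, p = 0.1 and t = 2 the first bound is the sharper of the two, and both stay below the value .111,527 quoted for E(1/x) in Example 1.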

Further, if only the first c < n terms in (20) are taken, 

(M) 


^ (<+1)9 

Doimd may be computed from 


j (rr “ l)(n — rc + l)p 


tUt 

(26.1) 

1 4 .1 

(26.2) 

1 . _L 2 ,1 

giU.+ g.'.+ gP* 

(26.3) 

3 *-i \» 3/ 

(26.4)
the choice among which may be governed by computing convenience. Taken
with (16), these inequalities provide lower and upper bounds for E(1/x).

5. Examples. Two examples will serve to illustrate the factorial series (15).


Example 1

Computation of E(1/x) for n = 100 and p = 0.1

np = 10        k = .000,265,621        E(1/x) = .111,527

         Binomial series                    Factorial series
   t     Sum of t terms     Sum of t terms     Lower bound*            Upper bound**

   1     .000,295           .098,984           .099,647                .132,167
   2     .001,107           .108,675           .109,006 (.111,034)     .116,247
   3     .003,071           .110,548           .110,752 (.111,313)     .112,498
   4     .007,039           .111,082           .111,223 (.111,381)     .111,852
   5     .013,813           .111,280           .111,385 (.111,452)     .111,657
   6     .023,743           .111,370           .111,452 (.111,478)     .111,587
   7     .036,442           .111,416           .111,483 (.111,489)     .111,556
   8     .050,796           .111,444           .111,500 (.111,497)     .111,544
   9     .065,287           .111,461           .111,509 (.111,503)     .111,537
  10     .078,474           .111,472           .111,514 (.111,508)     .111,534
  11     .089,372           .111,481           .111,518 (.111,511)     .111,532
  12     .097,604           .111,487           .111,520                .111,530
  13     .103,320           .111,492           .111,521                .111,529
  14     .106,985           .111,495           .111,523                .111,529
  15     .109,164           .111,498           .111,524                .111,529
  16     .110,369           .111,501           .111,524                .111,528
  17     .110,992           .111,503           .111,525                .111,528
  18     .111,294           .111,505           .111,525,4              .111,527,5
  19     .111,431           .111,506           .111,525,6              .111,527,3
  20     .111,489           .111,508           .111,525,8              .111,527,1

  24     .111,526
 100     .111,527 (end of series)

* Sum of t terms plus lower bound for $E(R_t(x))$ from (24) with c = 3.
Numbers in parentheses are calculated from (21).

** Sum of t terms plus upper bound on $E(R_t(x))$ from (26.3).

Example 2

Computation of E(1/x) for n = 1000 and p = 0.3

np = 300        k = 9.7 × 10^-14

                                Factorial series
        Sum of                  upper and lower
  t     t terms                 bounds*

  1     .003,330,003,330        .003,346,7
                                .003,341,0   (.003,341,155,4)
  2     .003,341,081,185        .003,341,211
                                .003,341,155
  3     .003,341,154,817        .003,341,156,29
                                .003,341,155,56
  4     .003,341,155,549        .003,341,155,58
                                .003,341,155,57
  5     .003,341,155,559

* Computed as in Example 1.

For the binomial series, the sum of the largest eight terms of (9), not the
first eight terms, is approximately .0007, which is less than 1/4 of the
value of E(1/x).


In the first example the value of np is almost small enough to make computation
by (9) convenient. In the second example about 120 terms of (9) must be computed
to obtain an approximation to four significant figures but only four terms
of the factorial series are needed to obtain seven significant figures. It is evident
that as np increases, the number of terms of (16) required to obtain an
approximation to a given number of significant figures decreases. The opposite
is true of (9) as n increases, or as p approaches a value near 1/2.


6. Extending the special condition. In some sampling problems all values
of x less than a specified value, g, and greater than another specified value, h,
are inadmissible. Then the probability of x in n is

(27)  $P(x \mid g, h) = \binom{n}{x} p^x q^{n-x} / s_{0,g,h}, \qquad g \le x \le h,$

where

(28)  $s_{0,g,h} = \sum_{x=g}^{h} \binom{n}{x} p^x q^{n-x}.$


With this new condition, E(1/x) is given by (15) if $u_i$ is replaced by


(29) 




/n + p q 

\X “f* i/ So,g^ 


and the summation in the remainder term is from g to h. Also since 
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a recursion relation similar to (18) may be used in computing 
(31) 


(n + i )! p* 

= (i - Dm-i.-,.* - (t - 1)1 {K/ia + * - i)l + W(ft + 

(n + i)p 


where 

(32) 

(33) 


' (n-g)!«o,,^ 

, ^ n!p*g*~*^* 

* (n — A) 1 «o.*A 


The inequalities (20) to (23) inclusive and (26) are applicable to this extension
on substitution of $u_{i,g,h}$ for $u_i$.

7. Expansion of $E(x^{-a})$ in a factorial series. Equation (11) may be extended
to other negative integral powers of x. If a is a positive integer


(34) 

where 

(35) 


Ei.-) - siPW = + 


+ ■ +few 


X^-^X\P(X) 

Ri (x) - 1 j bt+u 




and the $b_{i,j}$ are the absolute values of the Stirling numbers of the first kind (see
Table 2) formed by the recursion relation

(36)  $b_{i,j} = (i-1)\,b_{i-1,j} + b_{i-1,j-1}, \qquad b_{i,j} = 0$ if $j > i$ or $j < 1$.

It is evident that 

(37) 


(38) 

whence 


(39) 




{i — 1)1 and bij < t! if j > 1, 


Bid) = 


f + 1 


P(l) 


. at + 1)! - «!)*!P(X) 
"‘W <-2(iT f)i-’ 


X > 1 


(t + 1 ) 


P(x). 




58 


FREDERICK P. STEPHAN 


Hence $R'_t(x) \to 0$ and $E(R'_t(x)) \to 0$ as $t \to \infty$, and the sum of the first t terms of
(34) converges to $E(x^{-a})$ as $t \to \infty$.

The following recursion relation corresponding to (18) provides a simple procedure
for computing $u_{i,a}$:


(40)  $u_{i,a} = b_{i,a}\,u_i/(i-1)! = b_{i,a}\,\dfrac{(u_{i-1,a}/b_{i-1,a}) - k/i!}{(n+i)p}.$

The computing procedure, then, follows a cycle of four simple operations:

1. Divide {k/(i − 1)!} by i.

2. Subtract the quotient from {$u_{i-1,a}/b_{i-1,a}$}.

3. Divide the difference by {(n + i − 1)p} + p. The quotient is $u_{i,a}/b_{i,a}$.

4. Multiply this quotient by $b_{i,a}$.


TABLE 2

Absolute values of Stirling numbers of the first kind, b_{i,j}*

  i \ j          1            2            3          4          5         6

   1             1            0            0          0          0         0
   2             1            1            0          0          0         0
   3             2            3            1          0          0         0
   4             6           11            6          1          0         0
   5            24           50           35         10          1         0
   6           120          274          225         85         15         1
   7           720        1,764        1,624        735        175        21
   8         5,040       13,068       13,132      6,769      1,960       322
   9        40,320      109,584      118,124     67,284     22,449     4,536
  10       362,880    1,026,576    1,172,700    723,680    269,325    63,273

* These numbers are also known as differential coefficients of zero [4].
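The entries of Table 2 can be regenerated from the recursion relation for the absolute Stirling numbers of the first kind. The sketch below is illustrative only and is not taken from the paper; it assumes the usual form of the relation, b(i, j) = (i − 1) b(i − 1, j) + b(i − 1, j − 1), together with the starting value b(1, 1) = 1, and the function name is hypothetical.

```python
def unsigned_stirling_first(imax):
    """Return a table b with b[i][j] for 1 <= j <= i <= imax (imax >= 1).

    Assumed recursion: b[i][j] = (i - 1) * b[i-1][j] + b[i-1][j-1],
    starting from b[1][1] = 1; every entry with j > i or j < 1 is zero.
    """
    b = [[0] * (imax + 1) for _ in range(imax + 1)]
    b[1][1] = 1
    for i in range(2, imax + 1):
        for j in range(1, i + 1):
            b[i][j] = (i - 1) * b[i - 1][j] + b[i - 1][j - 1]
    return b


table = unsigned_stirling_first(10)
for i in range(1, 11):
    # Print the first six columns, as in Table 2.
    print(i, table[i][1:7])
```

Each row can be compared with the corresponding line of Table 2; the first column is (i − 1)!.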


The expressions in braces are quantities obtained in the preceding cycle.
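The cycle of four operations may be sketched as follows. This is an illustration under stated assumptions, not the author's own computing routine: the constant k is taken to be npq^n/(1 − q^n), which reproduces the value k = .000,265,621 of Example 1; the ratio u_i/(i − 1)! is started from 1 before the first cycle; and all function names are hypothetical. Exact rational arithmetic is used so that rounding does not accumulate.

```python
from fractions import Fraction
from math import comb, factorial


def unsigned_stirling_first(imax):
    """b[i][j] = (i-1) b[i-1][j] + b[i-1][j-1], with b[1][1] = 1 (assumed)."""
    b = [[0] * (imax + 2) for _ in range(imax + 2)]
    b[1][1] = 1
    for i in range(2, imax + 1):
        for j in range(1, i + 1):
            b[i][j] = (i - 1) * b[i - 1][j] + b[i - 1][j - 1]
    return b


def k_constant(n, p):
    """Assumed form of k: n p q^n / (1 - q^n). p must be a Fraction."""
    q = 1 - p
    return n * p * q ** n / (1 - q ** n)


def partial_sums(n, p, a, t):
    """Sums of the first 1, 2, ..., t terms b[i][a] * u_i / (i-1)!."""
    k = k_constant(n, p)
    b = unsigned_stirling_first(max(t, a))
    ratio = Fraction(1)            # assumed value of u_i/(i-1)! before cycle 1
    total = Fraction(0)
    sums = []
    for i in range(1, t + 1):
        # Steps 1 to 3: subtract k/i! and divide by (n + i) p.
        ratio = (ratio - k / factorial(i)) / ((n + i) * p)
        # Step 4: multiply by b[i][a]; the products are accumulated.
        total += b[i][a] * ratio
        sums.append(total)
    return sums


def direct_moment(n, p, a):
    """E(x^-a) for the positive binomial variate by direct summation."""
    q = 1 - p
    s0 = 1 - q ** n
    terms = (Fraction(comb(n, x)) * p ** x * q ** (n - x) / x ** a
             for x in range(1, n + 1))
    return sum(terms) / s0


p = Fraction(1, 10)
print([round(float(s), 6) for s in partial_sums(100, p, 1, 3)])
print(float(partial_sums(100, p, 2, 30)[-1]), float(direct_moment(100, p, 2)))
```

With a = 1, n = 100 and p = 0.1 the first three partial sums can be compared with the entries .098,984, .108,675 and .110,548 of Example 1; with a = 2 the partial sums approach the directly summed value of E(1/x²) from below.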

The $u_{i,a}$ may also be calculated from (18), or checked by such a calculation.
A lower bound for $E(R'_t(x))$ after t terms may be calculated from the first c
terms of

(41)  $E(R'_t(x)) = \sum_{x=1}^{n} R'_t(x) P(x) > \sum_{x=1}^{c} R'_t(x) P(x) = \sum_{x=1}^{c} \sum_{j=1}^{a} \dfrac{b_{t+1,j}\; n!\, p^x q^{n-x}}{x^{a-j+1}\,(x+t)!\,(n-x)!\,(1-q^n)}$


or from an inequality similar to (23) 
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which may also be written 


(43) 


> (( - mS(x))‘*^ + < - 1) • • • £(x) 

*+l 

- z b,^i.Amy}. 


An upper bound may be calculated from 


(44) ff(R'(^)) < + 1)«< 

(t — 1)! i-i 

f 

or 

® " ® x^P(x) 

E{R'(x)) < Z E'(x)P{x) + E E 

*-l *-c+l j-l “T tjlC 

(45) < Z R'(x)P(x) + --5^, Z^, = Z R'(x)P(.x) 

x-1 (i — 1; I y-i C® *-i 

Ut ( ."i 

+ it - 1)10“+* |(c + 0(c + « - 1) • • • C - 5,+i.ic'j. 


8. The positive hypergeometric variate. The theory of sampling without
replacement from a finite population rests on the hypergeometric variate. Its
probability function is

(46)  $P(x \mid N, M, n) = \binom{M}{x}\binom{N-M}{n-x} \Big/ \binom{N}{n}.$

In applications to finite sampling, N is the number of items in the population, 
M is the number of them that are of a certain kind, n is the number of items 
drawn for the sample, and x is the number of items of the designated kind in the 
sample. 

As in the case of the Bernoullian variate, it is necessary to exclude zero in
defining the expected value of 1/x. The probability function of the positive
hypergeometric variate, then, is

(47)  $P_h(x) = P(x \mid N, M, n)/s_0, \qquad x > 0,$

where

(48)  $s_0 = 1 - P(0 \mid N, M, n).$

Throughout this section the notation will have reference to (47) instead of (1).
The expected values of positive integral powers of x are

(49)  $E(x) = Mn/(N s_0),$

(50)  $E(x^2) = \dfrac{1}{s_0}\left\{\dfrac{M(M-1)\,n(n-1)}{N(N-1)} + \dfrac{Mn}{N}\right\},$
and, in general,

(51)  $E(x^a) = \sum_{j=1}^{a} \mathfrak{S}_a^{j}\, E\{x!/(x-j)!\},$

where the $\mathfrak{S}_a^{j}$ are the Stirling numbers of the second kind and

(52)  $E\left\{\dfrac{x!}{(x-j)!}\right\} = \dfrac{M!\,n!\,(N-j)!}{(M-j)!\,(n-j)!\,N!\,s_0}.$
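The moment formulas for the positive hypergeometric variate can be checked by enumeration. The sketch below is illustrative and not from the paper; it assumes the usual hypergeometric probability function, takes N = 20, M = 7, n = 5 as arbitrary trial values, and uses hypothetical function names. Exact fractions are used so that the comparisons are equalities.

```python
from fractions import Fraction
from math import comb, factorial


def positive_hypergeometric(N, M, n):
    """Return (s0, pmf) with pmf[x] = P(x | N, M, n) / s0 for x >= 1."""
    denom = comb(N, n)
    full = {x: Fraction(comb(M, x) * comb(N - M, n - x), denom)
            for x in range(0, min(M, n) + 1)}
    s0 = 1 - full[0]
    pmf = {x: pr / s0 for x, pr in full.items() if x >= 1 and pr > 0}
    return s0, pmf


def stirling_second(a, j):
    """Stirling number of the second kind by the usual recursion."""
    if a == j:
        return 1
    if j == 0 or j > a:
        return 0
    return j * stirling_second(a - 1, j) + stirling_second(a - 1, j - 1)


def factorial_moment(N, M, n, j, s0):
    """Right side of (52); requires j <= min(M, n)."""
    num = factorial(M) * factorial(n) * factorial(N - j)
    den = factorial(M - j) * factorial(n - j) * factorial(N)
    return Fraction(num, den) / s0


def power_moment(N, M, n, a):
    """E(x^a) from (51), using (52) for each factorial moment."""
    s0, _ = positive_hypergeometric(N, M, n)
    return sum(stirling_second(a, j) * factorial_moment(N, M, n, j, s0)
               for j in range(1, min(a, M, n) + 1))


N, M, n = 20, 7, 5
s0, pmf = positive_hypergeometric(N, M, n)
direct = {a: sum(x ** a * pr for x, pr in pmf.items()) for a in (1, 2, 3)}
print(direct[1] == Fraction(M * n, N) / s0, direct[3] == power_moment(N, M, n, 3))
```

The first comparison corresponds to (49) and the second to (51) with a = 3.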
The factorial series corresponding to (16) is

(53)  $E(1/x) = \sum_{x} \dfrac{1}{x}\, P_h(x) = \sum_{i=1}^{t} u_i + E(R_t(x)),$

where

(54)  $u_i = \sum_{x} \dfrac{(i-1)!\;x!}{(x+i)!}\, P_h(x)$

and

(55)  $E(R_t(x)) = \sum_{x} \dfrac{t!\,(x-1)!}{(x+t)!}\, P_h(x).$


The $u_i$ may be computed from
(Af + l)si 


Ml 


(56) 


(M + l)(n 4- l)so 




N+ 1 


(N - M)\(N - n)\ 


So\{M + l)(n + 1) 
and the recursion relation 


(57) 
where 

(58) 


Ui 


N\iN - M 
(N + i)si 

(M + i)(n + i)si-i 


n - 1)!(M + l)(n + 1) 


Ui-i 


$s_i = 1 - \sum_{x=0}^{i} P(x \mid N+i,\, M+i,\, n+i).$


The computing is quite simple in those instances in which $1 - s_i$ is negligible.

Corresponding to (26), an upper bound for the expected value of the remainders
after t terms may be computed from


tUt 


(59) 


hiut + iP^(i)/(i + 1) 

E{R,{x)) < + 3 f + I + 0 ^ 2 ) 

[j tri\x j; 


(x + or 


(59.1) 

(59.2) 

(59.3) 

(59.J-) 
A lower bound for the expected value of the remainders may be computed
from one of the following inequalities corresponding to (23), (21) and (24):

(60)  $E(R_t(x)) > t\,u_t N s_0/(Mn)$

(61)  $E(R_t(x)) > t\,u_t^2 / \{(t-1)\,u_{t-1} - t\,u_t\}$

(62)  $E(R_t(x)) > \sum_{x=1}^{c} R_t(x)\, P_h(x).$
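The three lower bounds can be compared numerically with the exact expected remainder in a small case. The sketch below is an illustration, not the author's procedure. It assumes that u_t = (t − 1)! E{x!/(x + t)!} and R_t(x) = t!(x − 1)!/(x + t)!, which makes E(1/x) equal to the sum of the first t terms plus E(R_t(x)); the trial values N = 20, M = 7, n = 5, t = 3, c = 2 and all function names are arbitrary choices.

```python
from fractions import Fraction
from math import comb, factorial


def positive_hypergeometric(N, M, n):
    """Return (s0, pmf) with pmf[x] = P(x | N, M, n) / s0 for x >= 1."""
    denom = comb(N, n)
    full = {x: Fraction(comb(M, x) * comb(N - M, n - x), denom)
            for x in range(0, min(M, n) + 1)}
    s0 = 1 - full[0]
    return s0, {x: pr / s0 for x, pr in full.items() if x >= 1 and pr > 0}


def u(t, pmf):
    """Assumed form of u_t: (t-1)! * E{ x! / (x+t)! }, for t >= 1."""
    return sum(Fraction(factorial(t - 1) * factorial(x), factorial(x + t)) * pr
               for x, pr in pmf.items())


def remainder(t, x):
    """Assumed form of R_t(x): t! (x-1)! / (x+t)!."""
    return Fraction(factorial(t) * factorial(x - 1), factorial(x + t))


def lower_bounds(N, M, n, t, c):
    """Right sides of (60), (61) and (62); requires t >= 2."""
    s0, pmf = positive_hypergeometric(N, M, n)
    ut, ut1 = u(t, pmf), u(t - 1, pmf)
    b60 = t * ut * N * s0 / (M * n)
    b61 = t * ut ** 2 / ((t - 1) * ut1 - t * ut)
    b62 = sum(remainder(t, x) * pmf[x] for x in range(1, c + 1) if x in pmf)
    return b60, b61, b62


N, M, n, t, c = 20, 7, 5, 3, 2
s0, pmf = positive_hypergeometric(N, M, n)
exact_remainder = sum(remainder(t, x) * pr for x, pr in pmf.items())
bounds = lower_bounds(N, M, n, t, c)
print([float(b) for b in bounds], float(exact_remainder))
```

Each of the three printed bounds should fall below the exact expected remainder printed last.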


The expected values of other negative integral powers of the positive hypergeometric
variate may be calculated from


(63) 
where 

(64) 


^(O = L h<,.u./(t - 1)1 + E(,R[{x)) . 


Rii^) 


a 


7-1 


x^-^x\Ph(x) 
7f{x + 01 * 


With $P_h(x)$ substituted for P(x), (39), (42), (43), (44), and (45) provide lower
and upper bounds for $E(R'_t(x))$ for the positive hypergeometric variate. Also,
corresponding to (41),

(65)  $E(R'_t(x)) > \sum_{x=1}^{c} R'_t(x)\, P_h(x).$




9. Variance and moments of 1/x and $x^{-a}$. The variance of 1/x, which is
$E(1/x^2) - \{E(1/x)\}^2$, may be calculated from (16) and (34), with a = 2, for the
positive Bernoullian variate, and from (53) and (63), with a = 2, for the positive
hypergeometric variate. Likewise, the variance of $x^{-a}$ and the moments of
1/x and $x^{-a}$ about E(1/x) may be computed by the usual formulae.
RANDOM WALK IN THE PRESENCE OF ABSORBING BARRIERS

M. Kac 

Cornell University 

1. Introduction. The problem of random walk (along a straight line) in the 
presence of absorbing barriers can be stated as follows: 

A particle, starting at the origin, moves in such a way that its displacements
in consecutive time intervals, each of duration Δt, can be represented by independent
random variables

$X_1, X_2, X_3, \ldots$

Moreover, if at some time the total (cumulative) displacement becomes > p
(p > 0) or < −q (q > 0) the particle gets absorbed. The problem is to determine
the probability that "the length of life" of the particle is greater than a
given number t. This problem also admits an interpretation in terms of a game
of chance in which the player quits when he loses more than q or wins more than
p. An interesting paper on this type of problem by A. Wald¹ appeared recently
in the Annals. Wald assumes that the X's are identically distributed and that
their mean and standard deviation are different from 0.² He is then mostly
interested in the limiting case when both the mean and the standard deviation
become small. The object of this paper is to propose a different method of
attack which in some cases leads to an answer in closed form. The method we
use has been employed repeatedly in statistical mechanics in the study of the
so called order-disorder problem. It is due, I believe, to E. W. Montroll.³ As
far as the author knows this method was never used in connection with the
classical probability theory and this seems to furnish an additional reason for
publishing this paper.

2. The simplest discrete case. We assume that each X is capable of assuming
the values 1 and −1 each with probability 1/2, and for simplicity's sake we let
Δt = 1. Note that, unlike in Wald's case, the mean of X is 0. Denote by N
the random variable which represents the "length of life" of the particle and
let (m an integer)

$\delta(m) = \begin{cases} \tfrac12, & m = 1 \text{ or } m = -1, \\ 0, & \text{otherwise.} \end{cases}$


¹ A. Wald, "On cumulative sums of random variables," Annals of Math. Stat., Vol. 15
(1944), pp. 283-296.

² Since this was written Professor Wald informed the author that he can easily avoid the
condition that the mean should be zero.

³ See for instance E. W. Montroll, "Statistical Mechanics of nearest neighbor systems,"
Jour. of Chem. Physics, Vol. 9 (1941), pp. 706-721.
Clearly we have (throughout this section we assume that both p and q are
integers)

$\text{Prob}\{N > n\} = \text{Prob}\{-q \le X_1 \le p,\; -q \le X_1 + X_2 \le p,\; \ldots,\; -q \le X_1 + \cdots + X_n \le p\} = \sum \delta(m_1)\,\delta(m_2) \cdots \delta(m_n),$

where the summation is extended over all integers $m_1, m_2, \ldots, m_n$ for which
$-q \le m_1 \le p,\; -q \le m_1 + m_2 \le p,\; \ldots,\; -q \le m_1 + m_2 + \cdots + m_n \le p.$
Letting

$l_j = q + m_1 + \cdots + m_j, \qquad (j = 1, 2, \ldots, n),$

we see that

(1)  $\text{Prob}\{N > n\} = \sum \delta(l_1 - q)\,\delta(l_2 - l_1) \cdots \delta(l_n - l_{n-1}).$

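Let us now consider the (p + q + 1) by (p + q + 1) matrix

(2)  $A = ((\delta(i - j))) = \begin{pmatrix} 0 & \tfrac12 & 0 & 0 & 0 & \cdots \\ \tfrac12 & 0 & \tfrac12 & 0 & 0 & \cdots \\ 0 & \tfrac12 & 0 & \tfrac12 & 0 & \cdots \\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \end{pmatrix}.$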


It is easily seen that the sum in (1) is equal to the sum of the elements in the
(q + 1)-st column (or row) of the matrix $A^n$. Thus
Prob. {N > n} = sum of the elements of the (q + 1)-st column of $A^n$.
Denote by $\lambda_1, \lambda_2, \ldots$ the eigenvalues of the matrix A and let

$(x_1^{(j)}, x_2^{(j)}, \ldots, x_{p+q+1}^{(j)})$

be the normalized eigenvector of A belonging to the eigenvalue $\lambda_j$. It can be
shown by elementary means* that

$\lambda_j = \cos \dfrac{\pi j}{p + q + 2}$


* Matrices of type (2) have been introduced and studied in various connections. In a
paper by R. P. Boas and the present author recently accepted by the Duke Mathematical
Journal references to several authors are given. In order to find the eigenvalues and the
eigenvectors of (2) it suffices to know that

$\begin{vmatrix} 1 & a & 0 & 0 & \cdots \\ a & 1 & a & 0 & \cdots \\ 0 & a & 1 & a & \cdots \\ 0 & 0 & a & 1 & \cdots \\ \cdots & \cdots & \cdots & \cdots & \cdots \end{vmatrix} = \dfrac{\rho_1^{m+1} - \rho_2^{m+1}}{\rho_1 - \rho_2},$

where m is the order of the matrix and $\rho_1$, $\rho_2$ are the roots of the equation $\rho^2 - \rho + a^2 = 0$.
and

$x_k^{(j)} = \sqrt{\dfrac{2}{p + q + 2}}\; \sin \dfrac{\pi j k}{p + q + 2}.$
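Denoting by R the orthogonal matrix

$R = \begin{pmatrix} x_1^{(1)} & \cdots & x_{p+q+1}^{(1)} \\ x_1^{(2)} & \cdots & x_{p+q+1}^{(2)} \\ \cdots & \cdots & \cdots \\ x_1^{(p+q+1)} & \cdots & x_{p+q+1}^{(p+q+1)} \end{pmatrix}$

and by R' the transposed of R we have (since the eigenvalues of A are simple)
by a well known theorem

$A = R' \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_{p+q+1} \end{pmatrix} R.$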


It thus follows by an easy computation that the sum of the elements of the
(q + 1)-st column (row) of $A^n$ is

$\sum_{j=1}^{p+q+1} \lambda_j^n\, x_{q+1}^{(j)} \Big( \sum_{r=1}^{p+q+1} x_r^{(j)} \Big).$


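We have

$\sum_{r=1}^{p+q+1} x_r^{(j)} = \sqrt{\dfrac{2}{p+q+2}} \sum_{r=1}^{p+q+1} \sin \dfrac{\pi j r}{p+q+2} = \begin{cases} 0, & j \text{ even}, \\ \sqrt{\dfrac{2}{p+q+2}}\; \cot \dfrac{\pi j}{2(p+q+2)}, & j \text{ odd}, \end{cases}$

and therefore*

$\text{Prob}\{N > n\} = \dfrac{2}{p+q+2} {\sum_{j}}^{*} \cos^n \dfrac{\pi j}{p+q+2}\; \sin \dfrac{\pi j (q+1)}{p+q+2}\; \cot \dfrac{\pi j}{2(p+q+2)},$

where the star on the summation sign indicates that only odd j's are taken under
account.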
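The closed formula can be compared with a direct step-by-step computation of the survival probability. The sketch below is an illustration and not part of the paper; the function names are hypothetical, and the walk is taken to survive as long as the cumulative displacement stays within −q, …, p inclusive, which corresponds to the p + q + 1 states of the matrix A.

```python
from math import cos, pi, sin, tan


def survival_by_recursion(p, q, n):
    """Prob{N > n} by propagating the walk n steps over states 0..p+q.

    State s stands for cumulative displacement s - q; the walk starts at
    s = q and mass stepping outside 0..p+q is absorbed (dropped).
    """
    size = p + q + 1
    probs = [0.0] * size
    probs[q] = 1.0
    for _ in range(n):
        new = [0.0] * size
        for s, mass in enumerate(probs):
            if s > 0:
                new[s - 1] += 0.5 * mass
            if s < size - 1:
                new[s + 1] += 0.5 * mass
        probs = new
    return sum(probs)


def survival_closed_form(p, q, n):
    """Prob{N > n} from the trigonometric sum over odd j."""
    m = p + q + 2
    total = 0.0
    for j in range(1, p + q + 2, 2):        # odd j only
        theta = pi * j / m
        total += cos(theta) ** n * sin(theta * (q + 1)) / tan(theta / 2)
    return 2.0 * total / m


for p, q, n in [(1, 0, 5), (3, 2, 7), (4, 4, 12)]:
    print(p, q, n, survival_by_recursion(p, q, n), survival_closed_form(p, q, n))
```

For p = 1, q = 0 the walk survives each step with probability 1/2, so both functions should return (1/2)^n.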

The method just illustrated is quite general but in more complicated cases 
the job of finding the eigenvalues and eigenvectors becomes formidable. 


* Professor Feller has called the author’s attention to the fact that similar problems and 
formulas can be found in Chapter III of W. Burnside’s Theory of Probability (Cambridge, 
1928). He also pointed out that the problem could be treated by means of Markoff chains. 
Professor G. E. Uhlenbeck has pointed out that our formula implies a known
result from the theory of Brownian motion.

Consider a free Brownian particle which at t = 0 is at $x = x_0$ ($x_0 > 0$). R.
Fürth* has shown that the probability that between t and t + dt the particle
will be either at x = 0 or at x = d ($0 < x_0 < d$) for the first time, is given by the
formula

$\dfrac{4\pi D}{d^2} \sum_{m=0}^{\infty} (2m+1) \exp\left\{-\dfrac{(2m+1)^2 \pi^2 D t}{d^2}\right\} \sin \dfrac{(2m+1)\pi x_0}{d}\; dt,$

where D is the "coefficient of diffusion."

If we treat the one-dimensional Brownian motion as a random walk with steps
±Δx, each move lasting Δt, the probability that a particle starting from $x_0$ will
not have reached 0 or d in the time interval (0, t) can be calculated by means
of our formula.

We must only put $q = x_0/\Delta x$, $p = (d - x_0)/\Delta x$, $n = t/\Delta t$ and assume that as
both Δx and Δt approach 0 the ratio $(\Delta x)^2/2\Delta t$ approaches the "coefficient of
diffusion" D.

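An elementary computation shows that in this limit the Prob. {N > t/Δt}
approaches

$\dfrac{4}{\pi} \sum_{j = 1, 3, 5, \ldots} \dfrac{1}{j} \exp\left\{-\dfrac{\pi^2 j^2 D t}{d^2}\right\} \sin \dfrac{\pi j x_0}{d}$

and that the differential of this expression (with a minus sign) gives exactly
Fürth's expression.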


3. General theory in the continuous case. We now assume that the distribution
function of X possesses a continuous and even density function p(x). We
have

$\text{Prob}\{N > n\} = \int \cdots \int_{\Omega} p(x_1) \cdots p(x_n)\, dx_1 \cdots dx_n,$

where the region of integration Ω is defined by the inequalities
$-q < x_1 < p,\; -q < x_1 + x_2 < p,\; \ldots,\; -q < x_1 + \cdots + x_n < p.$
Introducing the new variables

$y_j = q + x_1 + \cdots + x_j, \qquad (j = 1, 2, \ldots, n),$

we see that the Jacobian of the transformation is 1 and

(3)  $\text{Prob}\{N > n\} = \int_0^{p+q} \cdots \int_0^{p+q} p(y_1 - q)\, p(y_2 - y_1) \cdots p(y_n - y_{n-1})\, dy_1 \cdots dy_n.$

Consider the symmetric integral equation

(4)  $\int_0^{p+q} p(s - t)\, f(t)\, dt = \lambda f(s)$


* Ann. d. Phys. 63 (1917), p. 177.
and note that if $K_n(s, t)$ denotes the n-th iterated kernel of this integral equation,
the right side of (3) is equal to

$\int_0^{p+q} K_n(q, t)\, dt.$

Thus

$\text{Prob}\{N > n\} = \int_0^{p+q} K_n(q, t)\, dt.$

From the general theory of integral equations we know that

$K_n(s, t) = \sum_{j=1}^{\infty} \lambda_j^n f_j(s) f_j(t), \qquad (n \ge 2),$

where $\lambda_1, \lambda_2, \ldots$ are eigenvalues and $f_1(t), f_2(t), \ldots$ normalized eigenfunctions
of the integral equation (4).

Since p was assumed to be continuous it follows that the eigenfunctions are
continuous and

$\text{Prob}\{N > n\} = \sum_{j=1}^{\infty} \lambda_j^n f_j(q) \int_0^{p+q} f_j(t)\, dt.$

This formula is very general and provides, in a sense, a complete solution of the
problem in the continuous and symmetric case. Unfortunately the usefulness
of this formula is limited by the difficulties encountered in solving integral
equations of the type (4).

In fact, the integral equation

to which one is led by considering the normally distributed X's appears to be
very difficult to solve.
and differentiating twice with respect to s we obtain the differential equation

$f''(s) + \left(\dfrac{1}{\lambda} - 1\right) f(s) = 0.$

Substituting the general solution of this equation in (6) we find in an entirely
elementary fashion that

“ 1^' 

sin y,t + Vi cos yjt 

Vi+ i(p+ «)(! + »!)’ 

where $y_j$ is the jth (positive) root of the transcendental equation


(7) 

We have 


tan (p + 5)y = - 


2y 


1 - 


f 

Jo 


in yjt + yj cos yjt) dl = - {1 — cos (p + q)yj + py an (p + q)yi\ 

ys 


and it is easily seen that (7) implies 


1 - cos (p H- q)yj + p,- sin (p + p)py = 


Finally, 


Prob. {)V > n) = 2 2' 


0 if cos (p + «)Pi = I Z_ 

1 + Vi 

2 if cos (p + «)py = -7-7-^ • 
1 + 


sin Pyg + py cos Pyg 


« (1 + p^)" py {1 + i(p + 9)(1 + p?)} ’ 

where the dash on the summation sign indicates that only those j's are taken
under account for which


cos (p + q)yj = - 


1 -y-j 
1 + p?’ 


We omit here the discussion of various limiting cases inasmuch as our main 
purpose was to obtain exact formulas. 

There are indications that some of the limiting cases are related to singular 
integral equations with continuous spectra. We may return to this subject 
at a later date. 



ON THE CLASSIFICATION OF OBSERVATION DATA
INTO DISTINCT GROUPS

By R. v. Mises
Harvard University

Introduction. In scholastic examinations as well as in the examination of
industrial products the following probability problem arises. The individuals
of a certain population are successively subjected to trials each of which leads
to a definite score x (one real number or a group of m real numbers). Each
individual is supposed to belong to one of n classes. These classes are characterised
by n probability densities $p_1(x), p_2(x), \ldots, p_n(x)$. One has to decide on
the basis of the observed value x to which class the respective individual belongs
and one wishes to make this decision with the smallest possible risk of failure.

For example, let us consider an examination where the three grades A, B, C
are attributed on the basis of a simple score x (case m = 1, n = 3). It may be
assumed that an individual of the class A has a mean expected value of x equal
to $\vartheta_1 = 75$ and a normal distribution with the standard deviation $\sigma_1 = 4/\sqrt{2}$.
The analogous values for the classes B and C may be $\vartheta_2 = 50$, $\sigma_2 = 8/\sqrt{2}$ and
$\vartheta_3 = 25$, $\sigma_3 = 12/\sqrt{2}$. In this case, the solution developed in the present paper
allows the conclusion that the best way of grading would be to attribute the
grade A to scores x beyond 70.0, the grade C to scores below 40.0 and B to the
rest. The corresponding error risk will be 3.9% or the success rate 0.961.

There exists, of course, one case where the solution is trivial. If the probability
densities $p_\nu(x)$ are limited to n non-overlapping regions $R_\nu$ (with $p_\nu = 0$ at points
outside $R_\nu$) an obvious decision can be made without any risk of failure. An
assumption of this kind underlies the usual procedure of grading. If, in the
foregoing example, an individual of class A is supposed to have at any rate a score
beyond 60 and a class C individual less than 40, it is obvious how the grades
should be attributed without incurring any risk. It seems, however, that in
many problems the assumption of normal distributions or some other kind of
overlapping distributions is more appropriate. Then, the probability problem
has to be solved.

The solution submitted in the present paper is derived from the simplest
principles of calculus of probability without any arbitrary assumption or hypothesis.
If n equals 2, the problem can also be considered as a problem of testing
a simple statistical hypothesis with a two-valued parameter.¹ It has been shown
in an earlier paper² that under this restriction success rates higher than
50% are obtainable.


¹ See A. Wald, Annals of Math. Stat., Vol. 16 (1944), p. 146. Here, both $p_1(x)$ and $p_2(x)$
are supposed to be normal distributions with the same covariance matrix. The problem
treated by Wald is different from the one considered in the present paper since in Wald's
paper the parameters of the two multivariate normal distributions are assumed to be
unknown.

² R. v. Mises, Annals of Math. Stat., Vol. 14 (1943), p. 238.
1. Statement of the problem. For each of n classes of individuals a probability
density $p_\nu(x)$, ν = 1, 2, ..., n, is given. We subdivide the m-dimensional
x-space into n regions $R_1, R_2, \ldots, R_n$ and assign the region $R_\nu$ to the νth class.
The probability, for an individual of this class, to have its x-value falling in
$R_\nu$ is

(1)  $P_\nu = \int_{(R_\nu)} p_\nu(x)\, dX, \qquad \nu = 1, 2, \ldots, n,$

where dX denotes the element of the x-space (dX = dx in the case m = 1).
In the N first trials of the indefinite sequence of trials, $N_\nu$ individuals that
belong to the νth class will be tested. Out of these only those individuals whose
x-value falls in $R_\nu$ will be ascribed to the νth class. Their number, according
to the definition of probability, equals $N_\nu(P_\nu + \epsilon_\nu)$ where $\epsilon_\nu$ tends towards zero
as $N_\nu$ goes to infinity. The total number of correct decisions during the N first
trials is therefore

(2)  $N_1(P_1 + \epsilon_1) + N_2(P_2 + \epsilon_2) + \cdots + N_n(P_n + \epsilon_n)$

and the relative number is

(2')  $\dfrac{N_1}{N}(P_1 + \epsilon_1) + \dfrac{N_2}{N}(P_2 + \epsilon_2) + \cdots + \dfrac{N_n}{N}(P_n + \epsilon_n).$

If N increases indefinitely a part of the $N_\nu$ must become infinite. For these
classes, $\epsilon_\nu$ converges toward zero. For the other classes $N_\nu/N$ diminishes to
zero. Thus, the relative number of right decisions converges towards

(3)  $\dfrac{1}{N}(N_1 P_1 + \cdots + N_n P_n).$

The $N_\nu$ are unknown. Every one of these unknowns can take each value from
zero to N. If $P_\mu$ is the smallest $P_\nu$, the most unfavorable case, where the
expression (3) has its smallest value, will occur with $N_\mu = N$, all other $N_\nu$ being
zero. This value is obviously $P_\mu$. Thus it is seen that the frequency of correct
assignments is at least equal to the smallest $P_\nu$ which may be written as $P_{\min}$.
The greatest risk of making a false decision is $1 - P_{\min}$.

Now the problem to be solved in the present paper can be stated as follows:
For n given densities $p_\nu(x)$, find the subdivision of the x-space into n regions $R_\nu$
that gives to the smallest of the expressions $P_\nu$ defined in (1) its possibly greatest
value.

This problem has the type of a continuous variation problem with the integrals
in question bounded within the limits zero to one. We may, therefore, assume
that under reasonable restrictions for $p_\nu(x)$ a solution exists. Uniqueness of
the solution cannot be expected in general. It seems very difficult to establish
the conditions for unicity in other than the most simple cases. Existence of
more than one solution would mean that each of them is an optimum with
respect to infinitesimal modifications of the boundaries.
2. General solution. A simple problem of variation is considered as solved
in principle when the nature of the extremals is known. In our case of a so-called
minimax problem, where the minimum of n quantities is maximized, an
additional relation between the n integrals is required. Both can easily be
found in the actual case.

Let us first consider a partition of the x-space into n regions with not all $P_\nu$
being equal. The smallest $P_\nu$ will be called $P_{\min}$ and the smallest but one $P^*$.
Among the k regions for which $P_\nu = P_{\min}$ there will be at least one, say, $R_\alpha$ that
has a common border with a region $R_\beta$ whose P-value is greater, so that $P_\beta \ge P^*$.
Now modify the boundary between $R_\alpha$ and $R_\beta$ in such a way that the space
covered by $R_\alpha$ is increased and that of $R_\beta$ decreased. According to (1) the new
values of $P_\alpha$ and $P_\beta$ will be

(4)  $P'_\alpha = P_\alpha + \Delta, \qquad P'_\beta = P_\beta - \Delta'$

with both Δ and Δ' positive. The two quantities Δ and Δ' are not independent
of one another, but they can be chosen both smaller than any given positive
number ε. Therefore, the condition

(5)  $P'_\alpha = P_\alpha + \Delta < P_\beta - \Delta' = P'_\beta$

can be fulfilled. All other $P_\nu$-values remain unchanged.

In the case k = 1, that is, if only one region $R_\nu$ had originally the minimum
P-value, the modified system has a greater minimum P, which equals either
$P_\alpha + \Delta$ or $P^*$. If k > 1 the new system has the same minimum P as the original
one, but its k-value is diminished by one. If we repeat the same procedure
(k − 1) times we obtain a system of regions with one single $P_\nu$ having the minimum
P-value and the next step leads to a partition of the x-space into n regions
with a smallest P-value that is greater than the original $P_{\min}$. Thus it is seen
that no partition with unequal $P_\nu$-values can solve our problem.

Secondly, if m > 1, consider a system of n regions with $P = P_1 = P_2 = \cdots = P_n$.
Take two points, x and y, on the border of any two neighboring regions
$R_\nu$ and $R_\mu$. An infinitesimal variation of the boundary would consist of adding
to $R_\nu$ in the neighborhood of the point x a space element δS subtracting it from
$R_\mu$ and, at the same time, adding to $R_\mu$ in the vicinity of y an element δS' subtracting
it from $R_\nu$. Then, according to (1), the new values of $P_\nu$ and $P_\mu$ will be

(6)  $P'_\nu = P + p_\nu(x)\,\delta S - p_\nu(y)\,\delta S'$
     $P'_\mu = P - p_\mu(x)\,\delta S + p_\mu(y)\,\delta S'.$

Introducing $\Delta_\nu = P'_\nu - P$ and $\Delta_\mu = P'_\mu - P$, these equations solved for δS and
δS' give

(7)  $\delta S = \dfrac{p_\mu(y)\,\Delta_\nu + p_\nu(y)\,\Delta_\mu}{D}, \qquad \delta S' = \dfrac{p_\mu(x)\,\Delta_\nu + p_\nu(x)\,\Delta_\mu}{D},$

where

(7')  $D = p_\nu(x)\,p_\mu(y) - p_\mu(x)\,p_\nu(y).$
If the determinant D is positive, we find two positive quantities δS and δS'
for any pair of positive $\Delta_\nu$ and $\Delta_\mu$. If D is negative the same is true when x and
y are interchanged. In both cases, that is, with D ≠ 0, the original partition is
replaced by a new system of regions in which only two regions, $R_\nu$ and $R_\mu$, have
increased P-values, while (if n > 2) still $P_{\min} = P$. If to this system the procedure
as described in the foregoing is applied, a final partition with a greater
minimum value of P can be derived. The conclusion is that no solution of our
problem can include a boundary on which the determinant D is different from
zero for any two points x and y. On the other hand, it is seen that D = 0 means
that the ratio $p_\nu(x) : p_\mu(x)$ has a constant value along the border. Thus the
result is reached:

The partition of the x-space that solves our problem is characterized by two properties:
(1) for all n regions $R_\nu$ the value of $P_\nu$ is the same; (2) along the border between
$R_\nu$ and $R_\mu$ the ratio $p_\nu(x)/p_\mu(x)$ is constant.

In the one-dimensional case (m = 1) only the first of these two statements is
relevant. In any case, the success rate, that is, the guaranteed ratio of correct
decisions, equals the common value of all $P_\nu$.

3. Illustrations. (a) One-dimensional case. Upon introducing the cumulative
distribution functions

(8)  $F_\nu(x) = \int_{-\infty}^{x} p_\nu(z)\, dz$

the conditions $P_1 = P_2 = \cdots = P_n$ take the form

(9)  $F_1(x_1) = F_2(x_2) - F_2(x_1) = \cdots = F_{n-1}(x_{n-1}) - F_{n-1}(x_{n-2}) = 1 - F_n(x_{n-1})$

where $x_1, x_2, \ldots, x_{n-1}$ determine the n intervals on the both-sides infinite x-axis.
If all density functions have the same form except for an affine transformation,
one has

(10)  $F_\nu(x) = F[K_\nu(x - \vartheta_\nu)], \qquad \nu = 1, 2, \ldots, n.$

Let us assume, for instance, that scores between 0 and 100 are attributed to
three types of individuals. The first type may have an even chance to obtain a
score between 0 and 50, the second between 40 and 80 and the third between
70 and 100. Here

(11)  $F_\nu(x) = \tfrac12 + (x - \vartheta_\nu)\,\rho_\nu, \qquad |x - \vartheta_\nu| \le \dfrac{1}{2\rho_\nu},$

with $\vartheta_\nu$ = 25, 60, 85 and $\rho_\nu$ = 1/50, 1/40, 1/30. The conditions (9) supply

(12)  $\tfrac12 + \dfrac{x_1 - 25}{50} = \dfrac{1}{40}(x_2 - x_1) = \tfrac12 - \dfrac{x_2 - 85}{30}$

and this, solved for $x_1$, $x_2$, gives $x_1 = 41\tfrac23$, $x_2 = 75$ while the three expressions
(12) take the value 0.833. Therefore, in attributing all scores below $41\tfrac23$ to the
first class and all scores beyond 75 to the third, one is safe to make under no
circumstances more than 1/6 incorrect decisions in the long run.
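The arithmetic of this example can be retraced in a few lines. The sketch below is illustrative only; the function name is hypothetical, and it assumes that every dividing point falls inside the overlap of two neighbouring score ranges, in which case each class loses the same fraction and the common success rate is the total length of the scale divided by the sum of the range widths.

```python
from fractions import Fraction


def equal_rate_cuts(ranges):
    """Dividing points giving equal success rates for uniform classes.

    `ranges` lists (low, high) for consecutive classes, each uniform on
    its own range. Assumes each cut lies in the overlap of neighbouring
    ranges and raises ValueError when that assumption fails.
    """
    widths = [Fraction(high - low) for low, high in ranges]
    start, end = ranges[0][0], ranges[-1][1]
    rate = Fraction(end - start) / sum(widths)
    cuts = []
    position = Fraction(start)
    for idx in range(len(ranges) - 1):
        position += rate * widths[idx]       # class idx keeps this length
        low_next, high_here = ranges[idx + 1][0], ranges[idx][1]
        if not (low_next <= position <= high_here):
            raise ValueError("cut falls outside the overlap of the ranges")
        cuts.append(position)
    return rate, cuts


rate, cuts = equal_rate_cuts([(0, 50), (40, 80), (70, 100)])
print(rate, cuts)
```

For the three ranges of the text this gives the common rate 5/6 with dividing points 125/3 and 75.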
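In the example quoted in the introduction one has

$p_\nu(x) = \dfrac{1}{\sigma_\nu \sqrt{2\pi}} \exp\left\{-\dfrac{(x - \vartheta_\nu)^2}{2\sigma_\nu^2}\right\}$

with $\vartheta_\nu$ = 75, 50, 25 and $\sigma_\nu^2$ = 8, 32, 72. If Φ denotes the integral

$\Phi(z) = \dfrac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\, dt,$

the conditions (9) become

(14)  $1 + \Phi\left(\dfrac{x_1 - 25}{12}\right) = \Phi\left(\dfrac{x_2 - 50}{8}\right) - \Phi\left(\dfrac{x_1 - 50}{8}\right) = 1 - \Phi\left(\dfrac{x_2 - 75}{4}\right).$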

The first and last expression equated lead to $x_1 + 3x_2 = 250$. The complete
solution can be found with the help of tables for Φ. It is $x_1 = 39.9920$, $x_2 =
70.0027$ with the common value twice 0.961 for the three expressions (14).
Hence the result as quoted in the introduction.
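The dividing points of the grading example can be recomputed without tables. The following sketch is an illustration, not the author's method of solution: it uses the error function of the Python standard library in place of tables, eliminates the lower dividing point by equating the success rates of the two outer classes, and finds the upper one by bisection. All names and the bracketing interval 60 to 75 are assumptions of the sketch.

```python
from math import erf, sqrt

MEAN_A, MEAN_B, MEAN_C = 75.0, 50.0, 25.0
SD_A, SD_B, SD_C = 4 / sqrt(2), 8 / sqrt(2), 12 / sqrt(2)


def normal_cdf(x, mean, sd):
    """Cumulative normal distribution function."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))


def rates(x1, x2):
    """Success rates of C (below x1), B (between) and A (above x2)."""
    rate_c = normal_cdf(x1, MEAN_C, SD_C)
    rate_b = normal_cdf(x2, MEAN_B, SD_B) - normal_cdf(x1, MEAN_B, SD_B)
    rate_a = 1.0 - normal_cdf(x2, MEAN_A, SD_A)
    return rate_c, rate_b, rate_a


def lower_cut(x2):
    """x1 that equalises the rates of A and C for a given x2."""
    return MEAN_C + SD_C / SD_A * (MEAN_A - x2)


def solve(lo=60.0, hi=75.0, steps=200):
    """Bisection on x2 until the rate of B equals that of A."""
    for _ in range(steps):
        x2 = 0.5 * (lo + hi)
        _, rate_b, rate_a = rates(lower_cut(x2), x2)
        if rate_a > rate_b:
            lo = x2        # A still too favoured: move the cut upward
        else:
            hi = x2
    x2 = 0.5 * (lo + hi)
    return lower_cut(x2), x2


x1, x2 = solve()
print(round(x1, 4), round(x2, 4), [round(r, 4) for r in rates(x1, x2)])
```

The printed dividing points can be compared with the values quoted in the text, and the three rates with the success rate 0.961.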

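Let us now take up the case of six normal distributions with equidistant
mean values ϑ = ±a, ±3a, ±5a and one and the same variance σ². Then,
because of symmetry, two equations only have to be fulfilled:

$1 + \Phi\left(\dfrac{x_1 + 5a}{\sigma\sqrt{2}}\right) = \Phi\left(\dfrac{x_2 + 3a}{\sigma\sqrt{2}}\right) - \Phi\left(\dfrac{x_1 + 3a}{\sigma\sqrt{2}}\right) = \Phi\left(\dfrac{a}{\sigma\sqrt{2}}\right) - \Phi\left(\dfrac{x_2 + a}{\sigma\sqrt{2}}\right).$

For $\sigma^2/a^2 = 0.32$, the numerical solution gives

$x_1 = -4.160a, \qquad x_2 = -2.062a.$

The success rate, i.e. half the common value of the above expressions, is 0.931.
The six intervals extend from −∞ to $x_1$, from $x_1$ to $x_2$, from $x_2$ to 0, from 0 to
$-x_2$, from $-x_2$ to $-x_1$, and from $-x_1$ to ∞.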

(b) Case of more than one dimension. Let us assume that two classes A and
B have uniform distributions extending over volumes $V_1 = 1/p_1$ and $V_2 = 1/p_2$
respectively. If the two regions have a common part of volume V each surface
within the common space fulfills the condition $p_1/p_2$ = constant. Thus, the
two regions $R_1$ and $R_2$ are not uniquely determined but subject to one condition
only which determines the optimum success rate. If κV is cut out from $V_1$ and
(1 − κ)V from $V_2$, the relation must be fulfilled:

$1 - p_1 V \kappa = 1 - p_2 V (1 - \kappa), \quad \text{i.e.} \quad \kappa = \dfrac{p_2}{p_1 + p_2}$

and the success rate is

$S = 1 - p_1 V \kappa = 1 - \dfrac{p_1 p_2}{p_1 + p_2}\, V = 1 - p_2 V (1 - \kappa).$

If three classes A, B and C are considered with the densities $p_1 = 1/V_1$, $p_2 =
1/V_2$, $p_3 = 1/V_3$ and the first two regions have a space of volume V in common,
the latter two a space of volume V', the conditions are

$1 - p_1 V (1 - \kappa) = 1 - p_2(\kappa V + \lambda V') = 1 - p_3 (1 - \lambda) V'$

which supply

$\kappa = 1 - \dfrac{p_2 p_3}{p_1 p_2 + p_2 p_3 + p_3 p_1}\; \dfrac{V + V'}{V}, \qquad \lambda = 1 - \dfrac{p_1 p_2}{p_1 p_2 + p_2 p_3 + p_3 p_1}\; \dfrac{V + V'}{V'},$

and the success rate has the value

$S = 1 - (V + V')\, \dfrac{p_1 p_2 p_3}{p_1 p_2 + p_2 p_3 + p_3 p_1}.$

If the $p_\nu$ are normal density functions, say

$p_\nu(x, y) = \dfrac{\sqrt{D_\nu}}{\pi}\, e^{-Q_\nu},$

$Q_\nu = \alpha_\nu (x - a_\nu)^2 + 2\beta_\nu (x - a_\nu)(y - b_\nu) + \gamma_\nu (y - b_\nu)^2$

and $D_\nu$ the corresponding determinants, the curves separating the regions $R_\nu$
are the conics

$Q_\nu - Q_\mu = \text{const.}$

where the constants are determined by the conditions that all $P_\nu$ must be equal.
If the α, β, γ have the same values for every ν, the borders consist of straight
lines. In this case one can reduce the expressions for $p_\nu$, by an affine transformation,
to

$p_\nu(x, y) = \dfrac{1}{\pi}\, e^{-(x - a_\nu)^2 - (y - b_\nu)^2}.$

In the transformed plane the borderline between the regions $R_\nu$ and $R_\mu$ is perpendicular
to the straight line that connects the points $A_\nu(a_\nu, b_\nu)$ and $A_\mu(a_\mu, b_\mu)$.
If all points $A_\nu$ lie on the same straight line (in particular, if n = 2) the
whole problem is practically identical with the one-dimensional (m = 1). In
the case n = 3, in general, the three regions are confined by three lines perpendicular
to $A_1A_2$, $A_2A_3$, $A_3A_1$ passing through a point C whose coordinates
are determined by the equations $P_1 = P_2 = P_3$. If $r_\nu$ denotes the distance
$A_\nu C$ and $\varphi_\nu$, $\psi_\nu$ are the angles $A_\nu C$ forms with the adjacent sides of the triangle
$A_1A_2A_3$ one has to use the function

1 r** 

P(t, <pi) = 2 ^^^ ^ dz. 

Then the two conditions for C read

$F(r_1, \varphi_1) + F(r_1, \psi_1) = F(r_2, \varphi_2) + F(r_2, \psi_2) = F(r_3, \varphi_3) + F(r_3, \psi_3)$

and the success rate equals 0.5 plus the common value of these three expressions.



ON AN EXTENSION OF THE CONCEPT OF MOMENT WITH APPLICATIONS
TO MEASURES OF VARIABILITY, GENERAL
SIMILARITY, AND OVERLAPPING*

Milton da Silva Rodrigues

State University of São Paulo

1. Introduction. Given a frequency distribution D: $[X_i, F_i]$ (i =
1, 2, 3, ..., n), we shall call the expression

$M_r(D, X_j) = \sum_i (X_i - X_j)^r F_i$

the rth total moment of D about the origin $X_j$. We shall consider the weighted
sum

$\mathfrak{M}_r = \sum_j W_j\, M_r(D, X_j),$

where $W_j$ denotes the weight corresponding to the particular origin $X_j$, and the
summation is over a field Φ. In particular, if Φ is the set of all values assumed
in D by the variate $X_i$, and if $W_j = F_j$, we shall call the quantity $\mathfrak{M}_r$ the rth complete
total moment of D. If, on the contrary, $W_j$ is the frequency $F'_j$ of the value
$X'_j$ in a second frequency distribution D': $[X'_j, F'_j]$ and Φ is the set of all values
assumed by the variate $X'_j$ in D', $\mathfrak{M}_r$ will be called the rth aggregate moment
of D and D'. A modification of this procedure leads to what we shall call the
moment of transvariation of D and D'.

The consideration of complete moments draws attention to certain previously
known measures of variability which are independent of the origin selected,
and also provides simple methods of computation which are useful for data
given in the form of a frequency distribution. The investigation of aggregate
moments and moments of transvariation gives rise to certain measures of general
similarity between two distributions, as well as measures of the amount of overlapping.

2. Sliding and complete moments of a frequency distribution. 

2.1. We shall give the name sliding total moments of order $r$ to the successive
values, for particular values of $j$, of the expression

$$M_r(X_j) = F_j \sum_{i=1}^{n} \left[(X_i - X_j)^r F_i\right]. \tag{2.11}$$


* The Portuguese original of this paper was written in Brazil, in August 1943. Its translation
into English was entirely revised by Dr. T. Greville, Bureau of the Census, who also
proposed many simplifications in the derivation of formulae. For his painstaking labor
and interest I wish to express my very sincere appreciation. I also wish to thank Dr.
W. Edwards Deming for reading the manuscript and making several valuable suggestions.
The expression for the complete total moment, written out in full, is

$$\mathfrak{M}_r = \sum_{j=1}^{n} M_r(X_j) = \sum_{j=1}^{n} \sum_{i=1}^{n} \left[(X_i - X_j)^r F_i F_j\right]. \tag{2.12}$$

It is readily seen that the complete moment is independent of the choice of 
origin. 

2.2. If $r = 0$, we have

$$M_0(X_j) = F_j \sum_{i=1}^{n} F_i.$$

The complete total moment of order zero will therefore be

$$\mathfrak{M}_0 = \sum_{j=1}^{n} F_j \sum_{i=1}^{n} F_i = M_0^2 \tag{2.21}$$

where $M_0$ stands for the total moment of order zero about the origin of the $X$,
that is,

$$M_0 = N\nu'_0.$$

2.3. If $r = 1$, we shall have

$$M_1(X_j) = F_j \sum_{i=1}^{n} \left[(X_i - X_j) F_i\right].$$

Using $M_1$ to denote the total moment of order one about the origin of the $X$,
we obtain

$$M_1(X_j) = F_j \sum_{i} X_i F_i - X_j F_j \sum_{i} F_i = F_j M_1 - X_j F_j M_0.$$

Making $j$ vary from 1 to $n$ and summing, we have

$$\mathfrak{M}_1 = \sum_{j=1}^{n} F_j M_1 - \sum_{j=1}^{n} X_j F_j M_0 = M_0 M_1 - M_1 M_0 = 0. \tag{2.31}$$

This result is due to the fact that we took the deviations $X_i - X_j$ with their
proper signs. We may, however, calculate the value which the complete moment
of first order would have if absolute values were used. The sliding total
moment thus modified becomes

$$|M_1(X_j)| = F_j \left[\sum_{i=1}^{j-1} (X_j - X_i) F_i + \sum_{i=j}^{n} (X_i - X_j) F_i\right]$$

which may be put in the form

$$|M_1(X_j)| = F_j X_j \left[\sum_{i=1}^{j-1} F_i - \sum_{i=j}^{n} F_i\right] - F_j \left[\sum_{i=1}^{j-1} F_i X_i - \sum_{i=j}^{n} F_i X_i\right]. \tag{2.32}$$
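Summing with respect to $j$ and employing the substitutions

$$\sum_{i=j}^{n} F_i = M_0 - \sum_{i=1}^{j-1} F_i, \qquad \sum_{i=j}^{n} F_i X_i = M_1 - \sum_{i=1}^{j-1} F_i X_i \tag{2.33}$$

gives for the complete total moment

$$|\mathfrak{M}_1| = 2 \sum_{j=1}^{n} \left[F_j X_j \sum_{i=1}^{j-1} F_i\right] - 2 \sum_{j=1}^{n} \left[F_j \sum_{i=1}^{j-1} F_i X_i\right]. \tag{2.34}$$

The quotient

$$\mathfrak{m}_1 = \frac{|\mathfrak{M}_1|}{\mathfrak{M}_0} \tag{2.35}$$
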
of the complete total moment of order one by the complete total moment of order 
zero we shall call the complete unit moment of order one, or simply the complete 
moment of order one, when no confusion would result. 

The complete unit moment is a measure of variability, identical with that
already considered by Andrae and Helmert, respectively in 1869 and in 1876,
and which C. Gini, in 1912, called mean difference with repetition.²

The numerator of $\mathfrak{m}_1$ is easily computed if we observe that the upper limit $j - 1$
of the $F_i$ summation, for example, means that each product $X_jF_j$ must be multiplied
by the cumulative frequency corresponding to the class immediately preceding.
We only have to shift the cumulative frequencies column by one class
in the proper direction; the second term is similarly dealt with.
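The computing scheme just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the small data set are hypothetical, the classes are assumed to be sorted in ascending order of $X$, and formula (2.34) is read with inner sums running over $i < j$.

```python
from itertools import accumulate

def abs_complete_moment_direct(xs, fs):
    """|M_1| as the double sum of F_i * F_j * |X_i - X_j| over all pairs."""
    return sum(fi * fj * abs(xi - xj)
               for xi, fi in zip(xs, fs)
               for xj, fj in zip(xs, fs))

def abs_complete_moment_cumulative(xs, fs):
    """|M_1| by formula (2.34); xs must be sorted in ascending order."""
    # Cumulative columns shifted by one class: entry j holds the sum over i < j.
    cum_f = [0] + list(accumulate(fs))[:-1]
    cum_fx = [0] + list(accumulate(f * x for x, f in zip(xs, fs)))[:-1]
    first = sum(f * x * cf for x, f, cf in zip(xs, fs, cum_f))
    second = sum(f * cfx for f, cfx in zip(fs, cum_fx))
    return 2 * first - 2 * second

def mean_difference_with_repetition(xs, fs):
    """Complete unit moment of order one: |M_1| divided by M_0 squared."""
    n = sum(fs)
    return abs_complete_moment_cumulative(xs, fs) / (n * n)

xs, fs = [1, 2, 4], [2, 3, 1]  # illustrative data only
print(abs_complete_moment_direct(xs, fs))       # 36
print(abs_complete_moment_cumulative(xs, fs))   # 36
print(mean_difference_with_repetition(xs, fs))  # 1.0
```

The cumulative version needs a single pass over the classes, which is the practical advantage the text points to for data already grouped into a frequency table.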


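2.4. The second order sliding total moment is

$$M_2(X_j) = F_j \sum_{i=1}^{n} \left[(X_i - X_j)^2 F_i\right] = F_j M_2 - 2F_j X_j M_1 + F_j X_j^2 M_0$$

where $M_2$ is the total moment of order two. Summing with respect to $j$ gives
the complete total moment of order two

$$\mathfrak{M}_2 = \sum_{j=1}^{n} M_2(X_j) = 2(M_2 M_0 - M_1^2). \tag{2.41}$$

The complete unit moment of order two is therefore

$$\mathfrak{m}_2 = \frac{\mathfrak{M}_2}{\mathfrak{M}_0} = 2(\nu'_2 - \nu'^{\,2}_1) \tag{2.42}$$
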


² Apud Czuber, Wahrscheinlichkeitsrechnung, Vol. 2 (1932), p. 316. C. Gini,
Variabilità e Mutabilità, Cagliari, 1912.
where $\nu'$ stands for a unit moment about the origin of the $X$, namely

$$\nu'_r = \frac{\sum X^r F}{N}.$$

$\mathfrak{m}_2$ is also a measure of variability, independent of the choice of origin. It is
equal to the square of Gauss's "Präzisionsmass", and to the double of Fisher's
variance; like $\mathfrak{m}_1$ it was defined by Andrae and Helmert, and was called by Gini
the mean square difference with repetition.
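As a quick numerical check of (2.31), (2.41) and (2.42), the following sketch computes the complete moments by the defining double sum and compares the second one with twice the variance. The helper names and the data are illustrative assumptions, not part of the original paper.

```python
def complete_total_moment(xs, fs, r):
    """Complete total moment: sum over all pairs of F_i * F_j * (X_i - X_j)^r."""
    return sum(fi * fj * (xi - xj) ** r
               for xi, fi in zip(xs, fs)
               for xj, fj in zip(xs, fs))

def unit_moment_about_origin(xs, fs, r):
    """Unit moment about the origin: sum of F * X^r divided by N."""
    return sum(f * x ** r for x, f in zip(xs, fs)) / sum(fs)

xs, fs = [1, 2, 4], [2, 3, 1]  # illustrative data only
n = sum(fs)
m2 = complete_total_moment(xs, fs, 2) / n ** 2
variance = unit_moment_about_origin(xs, fs, 2) - unit_moment_about_origin(xs, fs, 1) ** 2
print(m2, 2 * variance)  # 2.0 2.0
# Signed complete moments of odd order vanish by antisymmetry.
print(complete_total_moment(xs, fs, 1), complete_total_moment(xs, fs, 3))  # 0 0
```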

2.5. If $r = 3$ we have for the sliding moments,

$$M_3(X_j) = F_j \sum_{i=1}^{n} (X_i - X_j)^3 F_i = F_j M_3 - 3F_j X_j M_2 + 3F_j X_j^2 M_1 - F_j X_j^3 M_0.$$

Summation over $j$ gives

$$\mathfrak{M}_3 = \sum_{j=1}^{n} M_3(X_j) = M_0 M_3 - 3M_1 M_2 + 3M_2 M_1 - M_3 M_0 = 0, \tag{2.51}$$

a result which is easily shown to hold for any complete moment of odd order.
We may calculate the value of the complete moment of order three using absolute
values of the deviations $X_i - X_j$ by a process similar to that previously described
for the calculation of $|\mathfrak{M}_1|$. This gives

$$|\mathfrak{M}_3| = 2\left[\sum_{j=1}^{n} F_j X_j^3 \sum_{i=1}^{j-1} F_i - 3 \sum_{j=1}^{n} F_j X_j^2 \sum_{i=1}^{j-1} F_i X_i + 3 \sum_{j=1}^{n} F_j X_j \sum_{i=1}^{j-1} F_i X_i^2 - \sum_{j=1}^{n} F_j \sum_{i=1}^{j-1} F_i X_i^3\right]. \tag{2.52}$$


2.6. The sliding moments of order four are

$$M_4(X_j) = F_j M_4 - 4F_j X_j M_3 + 6F_j X_j^2 M_2 - 4F_j X_j^3 M_1 + F_j X_j^4 M_0.$$

Summing with respect to $j$ and simplifying, we have

$$\mathfrak{M}_4 = M_0 M_4 - 4M_1 M_3 + 6M_2^2 - 4M_3 M_1 + M_4 M_0 = 2(M_0 M_4 - 4M_1 M_3 + 3M_2^2). \tag{2.61}$$

Dividing both sides by $\mathfrak{M}_0$ in order to obtain the complete moment on a unit
basis, we have

$$\mathfrak{m}_4 = \frac{\mathfrak{M}_4}{\mathfrak{M}_0} = 2(\nu'_4 - 4\nu'_1\nu'_3 + 3\nu'^{\,2}_2).$$

But, if $\nu$ indicates a moment about the mean

$$\nu_4 = \nu'_4 - 4\nu'_1\nu'_3 + 6\nu'^{\,2}_1\nu'_2 - 3\nu'^{\,4}_1.$$

By substitution, therefore

$$\mathfrak{m}_4 = 2(\nu_4 + 3\nu'^{\,2}_2 - 6\nu'^{\,2}_1\nu'_2 + 3\nu'^{\,4}_1) = 2\left[\nu_4 + 3(\nu'_2 - \nu'^{\,2}_1)^2\right] = 2(\nu_4 + 3\nu_2^2). \tag{2.62}$$

This complete moment gives rise to a measure of kurtosis independent of the
choice of origin

$$\frac{\mathfrak{m}_4}{\mathfrak{m}_2^2} = \frac{\nu_4}{2\nu_2^2} + \frac{3}{2}.$$

In case of mesokurtosis this reduces to 3, since for the normal curve $\nu_4/\nu_2^2 = 3$;
leptokurtosis and platykurtosis occur for the same ranges as in the case of Pearson's
measure $\beta_2$.
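The identity (2.62) and the kurtosis ratio built from it can be checked with exact rational arithmetic. The sketch below is an illustration only; the function names and the sample distribution are hypothetical.

```python
from fractions import Fraction

def complete_unit_moment(xs, fs, r):
    """Complete unit moment: double sum of F_i F_j (X_i - X_j)^r over M_0 squared."""
    n = sum(fs)
    total = sum(fi * fj * (xi - xj) ** r
                for xi, fi in zip(xs, fs)
                for xj, fj in zip(xs, fs))
    return Fraction(total, n * n)

def central_moment(xs, fs, r):
    """Unit moment of order r about the arithmetic mean."""
    n = sum(fs)
    mean = Fraction(sum(f * x for x, f in zip(xs, fs)), n)
    return sum(f * (x - mean) ** r for x, f in zip(xs, fs)) / n

xs, fs = [0, 1, 3, 6], [1, 2, 1, 2]  # illustrative data only
nu2, nu4 = central_moment(xs, fs, 2), central_moment(xs, fs, 4)
m2, m4 = complete_unit_moment(xs, fs, 2), complete_unit_moment(xs, fs, 4)

print(m4 == 2 * (nu4 + 3 * nu2 ** 2))                       # True
print(m4 / m2 ** 2 == nu4 / (2 * nu2 ** 2) + Fraction(3, 2))  # True
```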

3. Aggregate moments of two frequency distributions. 

3.1. Given two frequency distributions, $D$: $[X_i, F_i]$ ($i = 1, 2, 3, \ldots, n$) and
$D'$: $[X'_j, F'_j]$ ($j = 1, 2, 3, \ldots, p$) and a fixed point $X'_j$ belonging to the second
distribution, we shall call the expression

$$M_r(D, X'_j) = F'_j \sum_{i=1}^{n} (X_i - X'_j)^r F_i \tag{3.11}$$

the $r$th aggregate sliding total moment of the first distribution about the element
$X'_j$ of the second. Summation over $j$ gives

$${}^{a}\mathfrak{M}_r = \sum_{j=1}^{p} \sum_{i=1}^{n} F'_j (X_i - X'_j)^r F_i. \tag{3.12}$$

We shall call this quantity the aggregate complete total moment or, simply, the aggregate
total moment of $D$ about $D'$. It is clear that this is a symmetric function of the
two distributions, except for a change of sign in the case of odd moments.


3.2. If $r = 0$, we have

$$M_0(D, X'_j) = F'_j \sum_{i=1}^{n} F_i \tag{3.21}$$

$${}^{a}\mathfrak{M}_0 = \sum_{j=1}^{p} F'_j \sum_{i=1}^{n} F_i = M_0 M'_0. \tag{3.22}$$

3.3. If $r = 1$, we have

$$M_1(D, X'_j) = F'_j M_1 - F'_j X'_j M_0 \tag{3.31}$$

$${}^{a}\mathfrak{M}_1 = M_1 M'_0 - M_0 M'_1. \tag{3.32}$$

We shall call the quotient

$${}^{a}\mathfrak{m}_r = \frac{{}^{a}\mathfrak{M}_r}{{}^{a}\mathfrak{M}_0} \tag{3.33}$$

the aggregate unit moment of order $r$ (or the aggregate moment coefficient),
or simply the aggregate moment of order $r$ whenever the simpler name will not
cause confusion.

It is obvious that the aggregate moments are measures of general similarity,
as to form and position, between $D$ and $D'$. This similarity will be an identity
in case the two distributions coincide perfectly; on the other hand, it is clear that
there is no limit to the degree of non-similarity which may be encountered. We
shall take unity to represent the maximum and zero the minimum of similarity,
and thus define a provisional similarity index

$$S = \frac{\mathfrak{m}_1\, \mathfrak{m}'_1}{({}^{a}\mathfrak{m}_1)^2}. \tag{3.34}$$

But

$${}^{a}\mathfrak{m}_1 = \frac{M_1 M'_0 - M_0 M'_1}{M_0 M'_0} = A - A'$$

where $A$ and $A'$ stand for the arithmetic means of $D$ and $D'$, respectively. Now
it will be seen that if $A = A'$, $S = \infty$. This result is due to the fact that in the
calculation of $\mathfrak{m}_1$ and $\mathfrak{m}'_1$ we took the absolute values of the deviations $X_i - X_j$,
while in the calculation of ${}^{a}\mathfrak{m}_1$ we retained the algebraic signs. In order to make
the two terms of the fraction in (3.34) comparable, we can either: 1) calculate
${}^{a}\mathfrak{m}_1$ also using absolute values; or 2) take only the positive or only the negative
part of both numerator and denominator of $S$. In any case, $A = A'$ is a necessary
condition for the maximum of $S$.


3.4. We shall employ the first method suggested above, although we shall
return to the second in the third part of the paper. As long as $D$ and $D'$ do not
overlap, all the $X_i - X'_j$ deviations have the same sign and this is the same as
that of the difference $A - A'$. If, however, there is some overlapping this will
not be the case, some deviations having different signs from that of $A - A'$.
This brings us to Gini's concept of "transvariation". He applies this term to
any deviation $X_i - X'_j$ which does not have the same sign as $X - X'$, these
symbols denoting averages of any previously specified type; and he calls the
magnitude of the deviation its "intensity".

In computing the complete moment of the first order using absolute values,
in order to simplify the algebra we shall assume the same origin for $X$ and $X'$
and therefore drop the stroke from the $X$, but not of course from the $F$.
If certain values of $X$ occur in one distribution and not in the other, we can
merely consider the frequency as zero in the second distribution. In this way
the two distributions can be regarded as extending over the same total range.
If $X_1$ and $X_m$ denote the extreme values, the sliding total moment is

$$|M_1(D, X_j)| = F'_j \left[\sum_{i=1}^{j-1} (X_j - X_i) F_i + \sum_{i=j}^{m} (X_i - X_j) F_i\right].$$
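Summing with respect to $j$ and at the same time employing the substitutions
(2.33) or their transposed form, we obtain the following alternative expressions
for the complete aggregate moment:

$$|{}^{a}\mathfrak{M}_1| = M_1 M'_0 - M_0 M'_1 + 2 \sum_{j=1}^{m} \left[F'_j X_j \sum_{i=1}^{j-1} F_i\right] - 2 \sum_{j=1}^{m} \left[F'_j \sum_{i=1}^{j-1} F_i X_i\right] \tag{3.41}$$

$$|{}^{a}\mathfrak{M}_1| = M_0 M'_1 - M_1 M'_0 - 2 \sum_{j=1}^{m} \left[F'_j X_j \sum_{i=j}^{m} F_i\right] + 2 \sum_{j=1}^{m} \left[F'_j \sum_{i=j}^{m} F_i X_i\right]. \tag{3.42}$$

Note the similarity of the first of these forms to formula (2.34) which is in fact
a particular case of formula (3.41). Alternatively, we may obtain from formula
(3.42) the particular case

$$|\mathfrak{M}_1| = 2 \sum_{j=1}^{n} \left[F_j \sum_{i=j}^{n} F_i X_i\right] - 2 \sum_{j=1}^{n} \left[F_j X_j \sum_{i=j}^{n} F_i\right] \tag{2.34a}$$

which is equivalent to (2.34).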

If the two distributions do not overlap, $|{}^{a}\mathfrak{M}_1|$ does not differ numerically
from ${}^{a}\mathfrak{M}_1$. Let us consider the case in which there is actual overlapping, the
range of non-zero frequencies extending from $X_1$ to $X_{n+p}$ for $D$ and from $X_{n+1}$ to
$X_m$ for $D'$. Then formula (3.42) becomes, upon merely dropping all vanishing
terms,

$$|{}^{a}\mathfrak{M}_1| = M_0 M'_1 - M_1 M'_0 - 2 \sum_{j=n+1}^{n+p} \left[F'_j X_j \sum_{i=j}^{n+p} F_i\right] + 2 \sum_{j=n+1}^{n+p} \left[F'_j \sum_{i=j}^{n+p} F_i X_i\right]. \tag{3.43}$$

On the other hand, formula (3.41) reduces, under the same circumstances, to a 
much less simple expression, which upon making the substitutions (2.33) and 
simplifying reduces to 

iml = MoM[ - MiM'o + 22 \F'iXs 2 Tf-yl 

y—n+1 L j—n+l J 

n+p r n+p “I 

(3.44) - 2 2 

y-n+lL »—7 J 

n+p n+p n+p n+p 

-2 2 F'iX.X. Fy + 2 2 F', E FiXi. 

y—n+l tMn+L y*«n+l ta«n+l 


This result may be arrived at somewhat more easily by merely making the sub¬ 
stitutions (2.33) directly in formula (3.43). It may be noted that formula 
(3.44) at once reduces to the form (2.34) if the two distributions are identical, 
since the additional terms all cancel. It is, however a less satisfactory result 
than formula (3.43) because of the larger number of terms it contains. In order 
to obtain a formula which resembles (2.34) more closely, we may reverse the 
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order of summation in formula (3.43). Observing that terms for t 
collectively vanish, we see that 


lm| = MoM,'- 

- 2 if Ff, £ F'ix\^2 £ Ff^X* 2 f;]. 

»-n+l L J-n+1 J L J-S-l J 


It will be seen that the simple method of numerical computation described in
section 2.3 is immediately applicable to all the formulas (3.41) to (3.45).
Dividing any of these expressions by ${}^{a}\mathfrak{M}_0$ gives $|{}^{a}\mathfrak{m}_1|$. For example, if formula
(3.43) is used, we have

$$|{}^{a}\mathfrak{m}_1| = A' - A - \frac{2}{M_0 M'_0} \left\{\sum_{j=n+1}^{n+p} \left[F'_j X_j \sum_{i=j}^{n+p} F_i\right] - \sum_{j=n+1}^{n+p} \left[F'_j \sum_{i=j}^{n+p} F_i X_i\right]\right\}. \tag{3.46}$$

Substituting this value in equation (3.34), we have

$$S_1 = \frac{\mathfrak{m}_1\, \mathfrak{m}'_1}{|{}^{a}\mathfrak{m}_1|^2}, \tag{3.47}$$

a quantity which we shall call the "mean coefficient of similarity."

We now observe that Si is a general measure of similarity whose magnitude 
is affected by differences in either form or position. It may, however, be de¬ 
sirable to eliminate the position element, in order to isolate the form aspect. 
To do this it will suffice to relate the value which | ‘'mi | would have for A = A', 
to the product mimi. This value of | ‘'mi | is, in fact, its minimum; denoting 
it by ‘'mi we obtain the index 


(3.48) 


@1 


mi ml 



which we shall call the mean similarity ratio. 

It is clear that all the above mentioned indices measure overlapping as well 
as similarity. Overlapping between two distributions will be greatest when 
their similarity is greatest, or when | ‘'mi | is a minimum. In order to bring 
out more clearly the overlapping aspect we may follow Gini’s procedure of con¬ 
trasting the actual value of a measure with its maximum value. As already 
pointed out, if the form of the two distributions is held constant, but their rela¬ 
tive position is varied, the degree of overlapping, as measured by the mean simi¬ 
larity ratio, is greatest when the arithmetic means coincide. This method of 
procedure is embodied in the index 

(3.49) Zi = 

^ ®mi 

which we shall call the ^‘intensity of transvariation or overlapping.” To calcu¬ 
late 'mi we may, for example, merely add the difference A' — A = c to the X 
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values, in order to move D along the X-axis a distance of c, and then proceed to 
calculate | | in the usual manner from the adjusted X values. 

3.5. If, in (3.11), $r = 2$, we have

$$M_2(D, X'_j) = F'_j \sum_{i=1}^{n} (X_i - X'_j)^2 F_i = F'_j M_2 - 2X'_j F'_j M_1 + X_j'^{\,2} F'_j M_0.$$

Summing for $j$ then gives

$${}^{a}\mathfrak{M}_2 = M'_0 M_2 - 2M'_1 M_1 + M'_2 M_0. \tag{3.51}$$

We define the second aggregate unit moment as

$${}^{a}\mathfrak{m}_2 = \frac{{}^{a}\mathfrak{M}_2}{{}^{a}\mathfrak{M}_0} = \frac{M_2}{M_0} - 2\,\frac{M_1}{M_0}\,\frac{M'_1}{M'_0} + \frac{M'_2}{M'_0} = \sigma^2 + \sigma'^2 + (A - A')^2 \tag{3.52}$$

where the $\sigma$ and the $A$ stand for the standard deviations and the arithmetic
means of the respective distributions. Now we define the "mean square coefficient
of similarity" as the value of

$$S_2 = \frac{4\sigma^2 \sigma'^2}{[\sigma^2 + \sigma'^2 + (A - A')^2]^2}.$$

It is obvious that a maximum value of $S_2$ requires that $A = A'$, which is also a necessary
condition for the maximum degree of overlapping. Maximum similarity requires,
in addition, $\sigma = \sigma'$, in which case $S_2 = 1$.
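A short sketch may help to see how (3.52) and the mean square coefficient of similarity behave numerically. It computes the second aggregate unit moment by the defining double sum and compares it with $\sigma^2 + \sigma'^2 + (A - A')^2$; the function names and data are assumptions made for illustration.

```python
from fractions import Fraction

def mean_and_variance(xs, fs):
    """Arithmetic mean A and variance sigma^2 of a frequency distribution."""
    n = sum(fs)
    mean = Fraction(sum(f * x for x, f in zip(xs, fs)), n)
    var = sum(f * (x - mean) ** 2 for x, f in zip(xs, fs)) / n
    return mean, var

def aggregate_unit_moment2(xs, fs, ys, gs):
    """Second aggregate unit moment by the double sum over both distributions."""
    total = sum(f * g * (x - y) ** 2
                for x, f in zip(xs, fs)
                for y, g in zip(ys, gs))
    return Fraction(total, sum(fs) * sum(gs))

def mean_square_similarity(xs, fs, ys, gs):
    """S_2 = 4 sigma^2 sigma'^2 / (second aggregate unit moment)^2."""
    _, v1 = mean_and_variance(xs, fs)
    _, v2 = mean_and_variance(ys, gs)
    return 4 * v1 * v2 / aggregate_unit_moment2(xs, fs, ys, gs) ** 2

xs, fs = [1, 2, 4], [2, 3, 1]  # illustrative data only
print(mean_square_similarity(xs, fs, xs, fs))         # 1
print(mean_square_similarity(xs, fs, [3, 4, 6], fs))  # 1/9
```

Shifting the second distribution by two units leaves both variances unchanged but adds $(A - A')^2 = 4$ to the aggregate moment, which is why the index drops from 1 to 1/9.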

For a measure of similarity which is independent of difference in position be¬ 
tween the two distributions, we define. 


where V 2 is the minimum value of ‘'m 2 for all positions of the two distributions, 
without changing their form. This is obtained by merely taking 

(3.65) V* = (T* + «r'*. 


For a measure of overlapping we can follow Gini in contrasting the actual
value of ${}^{a}\mathfrak{m}_2$ with its minimum $\sigma^2 + \sigma'^2$, since the maximum of overlapping corresponds
to the minimum value of ${}^{a}\mathfrak{m}_2$. We thus set

$$Z_2 = \frac{\sigma^2 + \sigma'^2}{\sigma^2 + \sigma'^2 + (A - A')^2}, \tag{3.56}$$

a measure which we shall call the "density of overlapping". Its maximum
value is unity.

It may be remarked that all the indices proposed in this paragraph are easier 
to calculate than those of paragraph 3.4. The individual terms are all functions 
of only one of the two distributions; yet the resulting indices are independent of 
the origin chosen, and therefore free from any criticism based on doubt as to the 
representativeness of the arithmetic mean, in cases of marked skewness. 


4. Positive and negative moments, and moments of transvariation.

4.1. The aggregate sliding total moment of two frequency distributions $D$
and $D'$ may be expressed in the form

$$M_r(D, X_j) = F'_j \sum_{i=1}^{j} (X_i - X_j)^r F_i + F'_j \sum_{i=j+1}^{m} (X_i - X_j)^r F_i \tag{4.11}$$

when both distributions have been artificially extended, if necessary, to cover
the same total range, as previously described in section 3.4. We shall characterize
the second term in the right member of (4.11) as the positive sliding
moment, and the absolute value of the first term as the negative sliding moment.
We shall denote these moments by ${}^{+}M_r(D, X_j)$ and ${}^{-}M_r(D, X_j)$. The complete
moments obtained by summing these separate terms over the range of values of
$j$ we shall call the positive and negative aggregate complete moments. Thus
the positive complete moment is

$${}^{+a}\mathfrak{M}_r = \sum_{j=1}^{m} \left[F'_j \sum_{i=j+1}^{m} (X_i - X_j)^r F_i\right] \tag{4.12}$$

and the negative complete moment is

$${}^{-a}\mathfrak{M}_r = \left|\sum_{j=1}^{m} \left[F'_j \sum_{i=1}^{j} (X_i - X_j)^r F_i\right]\right|. \tag{4.13}$$
That one of these two partial moments which is obtained from differences X* — 
Xj having the opposite sense to that of the difference X — X' will be called the 
moment of transvariation of the two distributions and will be denoted by the 
symbol ^9Kr. Here, as in section 3.4, X and X' denote averages of any pre¬ 
viously selected type. For example, if the arithmetic means are the averages 
selected, and if A — A' is positive, then the negative aggregate moment is the 
moment of transvariation, and vice-versa. 

In the trivial case in which the two distributions are identical, the positive 
and negative complete moments are equal, and both reduce to merely one half 
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the aggregate complete moment (computed by the use of absolute values in the 
case of moments of odd order). 
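The split into positive and negative moments can be illustrated as follows. This is a sketch under assumptions: both distributions are placed on a common ascending grid of $X$ values, the positive moment is taken over $i > j$ and the negative moment as the absolute value of the sum over $i \le j$ (the reading of (4.11) adopted here), and all names and data are hypothetical.

```python
def positive_moment(xs, fs, gs, r=1):
    """Sum over j of F'_j times the sum over i > j of (X_i - X_j)^r F_i."""
    m = len(xs)
    return sum(gs[j] * sum((xs[i] - xs[j]) ** r * fs[i] for i in range(j + 1, m))
               for j in range(m))

def negative_moment(xs, fs, gs, r=1):
    """Absolute value of the sum over j of F'_j times the sum over i <= j."""
    m = len(xs)
    return abs(sum(gs[j] * sum((xs[i] - xs[j]) ** r * fs[i] for i in range(j + 1))
                   for j in range(m)))

# Identical distributions: both parts equal half of |M_1| (which is 36 here).
xs, fs = [1, 2, 4], [2, 3, 1]
print(positive_moment(xs, fs, fs), negative_moment(xs, fs, fs))  # 18 18

# Non-overlapping distributions on a common grid, with A < A'.
grid = [1, 2, 3, 4]
f_d, f_dprime = [1, 2, 0, 0], [0, 0, 3, 1]
print(positive_moment(grid, f_d, f_dprime))  # 0
print(negative_moment(grid, f_d, f_dprime))  # 19
```

In the second case $A - A'$ is negative, so the positive moment plays the part of the moment of transvariation, and it vanishes because the two distributions do not overlap.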

The unit moment of transvariation will be defined as 


(4.14) 


mr 


'aw, 

'a)?« 


4.2. It is evident that the moments of transvariation can be considered as 
measures of overlapping. Any such moment equals zero when there is no over¬ 
lapping and becomes greatest when the two distributions coincide. Taking unity 
to represent the maximum and zero the minimum of overlapping, we may choose 
as a general measure of overlapping, 


(4.21) 


4V _ 

^|m;i |2«,||a«;i 


It will be seen that this quantity always equals zero when there is no overlapping,
and equals unity when there is complete overlapping: that is, when the two
distributions are identical.


5. Need for further developments. All of the measures above described
were defined for the case of finite sets of magnitudes, expressed as frequency
distributions $D$ and $D'$. Now these sets of magnitudes may be thought of as
samples drawn out of their corresponding universes. The consideration of these 
universes would lead to more general representations under the form of frequency 
functions, and the above measures would be expressed as definite integrals rather 
than summations. This draws attention to the need for tests of significance of 
the magnitude of all the above measures, especially those of overlapping, in 
order to allow for sampling fluctuation. Obviously, when the frequency func¬ 
tions are of the asymptotic type some amount of overlapping will always exist. 



ON A PROBLEM OF ESTIMATION OCCURRING IN PUBLIC OPINION 

POLLS 

By Henry B. Mann

Ohio State University 

To arrive at an estimate of the number of electoral votes that will be cast for
a presidential candidate a poll is taken of $\lambda_i N$ interviews in the $i$th state ($i = 1,
\ldots, 48$), where the $\lambda_i$ are fixed constants $> 0$ such that $\sum \lambda_i = 1$, and the
respondent is asked for which candidate he intends to cast his vote. To estimate
the number of electoral votes which candidate A will receive, the electoral votes
of all the states in which the poll shows a majority for candidate A are added
and their sum is used as an estimate for the number of electoral votes which
candidate A will receive. In this paper certain properties of this estimate will
be discussed. It will be shown that it is a biased but consistent estimate and
an upper bound for the bias will be derived. Finally we shall derive that
distribution of interviews which minimizes the variance of our estimate.

In all that follows we shall consider the poll as a random or stratified random
sample and shall disregard the bias introduced by inaccurate answers. Our
results however remain valid as long as the sampling variance is inversely proportional
to the number of interviews in each state.

We shall use the following notation:

$\pi_i$ = proportion of voters in the $i$th state who intend to vote for candidate A,

$\epsilon_i$ = 1 if $\pi_i > \tfrac12$, and 0 if $\pi_i < \tfrac12$,

$w_i$ = number of electoral votes of the $i$th state,

$p_i$, $e_i$ = sample values of $\pi_i$ and $\epsilon_i$ resp.

We shall further exclude the case $\pi_i = \tfrac12$.

The number of electoral votes for candidate A is then given by

$$\sum_{i=1}^{48} \epsilon_i w_i = \Gamma.$$

As an estimate of $\Gamma$ we use the quantity

$$\sum_{i=1}^{48} e_i w_i = G. \tag{1}$$

Let $P_i$ be the probability that $p_i > \tfrac12$ and hence $e_i = 1$. Let $\lambda_i N = N_i$ be the
number of interviews in the $i$th state. If $N_i$ is not too small then $P_i$ is given by

$$P_i = \frac{1}{\sqrt{2\pi}\,\sigma_i} \int_{1/2}^{\infty} e^{-(x - \pi_i)^2 / 2\sigma_i^2}\, dx. \tag{2}$$

In this formula $\sigma_i = \sqrt{\pi_i(1 - \pi_i)/N_i}$ if the sample is an unstratified random
sample and may be somewhat less if the sample is a stratified random sample.¹
For our purposes it is sufficient to assume that $\sigma_i$ is proportional to $1/\sqrt{N_i}$.

We then have $E(e_i) = P_i$ and

$$E(G) = \sum E(e_i) w_i = \sum P_i w_i. \tag{3}$$

Hence $G$ is a biased estimate of $\Gamma$. On the other hand² plim $p_i = \pi_i$ and
hence plim $e_i = \epsilon_i$ and therefore plim $G = \Gamma$. That is to say $G$ is a consistent
estimate of $\Gamma$.

According to (3) the bias is given by

$$B(N) = \sum_{i=1}^{48} \epsilon_i w_i - \sum_{i=1}^{48} P_i w_i = \sum_{i=1}^{48} (\epsilon_i - P_i) w_i. \tag{4}$$
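To make formulas (2) to (4) concrete, here is a minimal sketch that evaluates $P_i$ with the normal approximation for an unstratified sample and then the bias $B(N)$. The three "states", their proportions and their electoral votes are invented for illustration, and the function names are not from the paper.

```python
import math

def prob_sample_majority(pi, n):
    """P_i of formula (2): probability that the sample proportion exceeds 1/2."""
    sigma = math.sqrt(pi * (1 - pi) / n)   # unstratified random sample assumed
    z = (0.5 - pi) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper normal tail beyond z

def expected_estimate(pis, ws, ns):
    """E(G) of formula (3)."""
    return sum(prob_sample_majority(pi, n) * w for pi, w, n in zip(pis, ws, ns))

def bias(pis, ws, ns):
    """B(N) of formula (4): sum of (epsilon_i - P_i) * w_i."""
    return sum(((1 if pi > 0.5 else 0) - prob_sample_majority(pi, n)) * w
               for pi, w, n in zip(pis, ws, ns))

pis = [0.55, 0.52, 0.60]   # hypothetical true proportions for candidate A
ws = [10, 5, 20]           # hypothetical electoral votes
for n in (100, 1000):
    print(n, round(bias(pis, ws, [n] * 3), 3))
```

With all three proportions above one half the bias is positive, and it shrinks as the number of interviews per state grows, in line with the consistency of $G$.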


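We have

$$|\epsilon_i - P_i| = P_i = \frac{1}{\sqrt{2\pi}\,\sigma_i} \int_{1/2}^{\infty} e^{-(x - \pi_i)^2 / 2\sigma_i^2}\, dx \quad \text{if } \pi_i < \tfrac12,$$

$$|\epsilon_i - P_i| = 1 - P_i = \frac{1}{\sqrt{2\pi}\,\sigma_i} \int_{-\infty}^{1/2} e^{-(x - \pi_i)^2 / 2\sigma_i^2}\, dx \quad \text{if } \pi_i > \tfrac12.$$

For a stratified as well as for an unstratified sample $\sigma_i$ is proportional to $1/\sqrt{N_i}$
and we therefore put

$$\gamma_i \sqrt{N_i} = \begin{cases} \dfrac{\tfrac12 - \pi_i}{\sigma_i} & \text{if } \pi_i < \tfrac12 \\[2ex] \dfrac{\pi_i - \tfrac12}{\sigma_i} & \text{if } \pi_i > \tfrac12. \end{cases} \tag{5}$$

Then we have in both cases

$$|\epsilon_i - P_i| = \frac{1}{\sqrt{2\pi}} \int_{\gamma_i \sqrt{N_i}}^{\infty} e^{-\frac12 x^2}\, dx. \tag{6}$$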

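We have for $a > 0$

$$\int_a^{\infty} e^{-\frac12 x^2}\, dx < h\left(e^{-\frac12 a^2} + e^{-\frac12 (a+h)^2} + e^{-\frac12 (a+2h)^2} + \cdots\right) < e^{-\frac12 a^2}\, h\left(1 + e^{-ah} + e^{-2ah} + \cdots\right) = e^{-\frac12 a^2}\, \frac{h}{1 - e^{-ah}}$$

for every value $h$. Since $\lim_{h \to 0} \dfrac{h}{1 - e^{-ah}} = \dfrac{1}{a}$ we have

$$\int_a^{\infty} e^{-\frac12 x^2}\, dx < \frac{e^{-\frac12 a^2}}{a} \quad \text{for every } a > 0. \tag{7}$$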
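The chain of inequalities leading to (7) can be checked numerically. The sketch below evaluates the tail integral through the complementary error function and compares it with the bound of (7) and with the geometric-series bound for a fixed step $h$; the function names are illustrative assumptions.

```python
import math

def upper_tail_integral(a):
    """Integral of exp(-x^2 / 2) from a to infinity, via erfc."""
    return math.sqrt(math.pi / 2) * math.erfc(a / math.sqrt(2))

def tail_bound(a):
    """Right-hand side of (7): exp(-a^2 / 2) / a."""
    return math.exp(-a * a / 2) / a

def geometric_bound(a, h):
    """Intermediate bound exp(-a^2 / 2) * h / (1 - exp(-a h)) for step h > 0."""
    return math.exp(-a * a / 2) * h / (1 - math.exp(-a * h))

for a in (0.1, 0.5, 1.0, 2.0, 5.0):
    # Each row is increasing from left to right, as the derivation requires.
    print(a, upper_tail_integral(a), tail_bound(a), geometric_bound(a, 0.5))
```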


¹ The variance in public opinion polls is somewhat larger than the random sampling
variance due to the fact that a cluster sample is used and not a random sample. For the
same reason the estimate $p_i$ of $\pi_i$ may be biased.

² For the notation used here see: H. B. Mann and A. Wald, "On stochastic limit and
order relationships", Annals of Math. Stat. (1943), pp. 217-227.
From (6) and (7) we obtain

$$|\epsilon_i - P_i| < \frac{e^{-\frac12 \gamma_i^2 N_i}}{\sqrt{2\pi}\, \gamma_i \sqrt{N_i}}. \tag{8}$$

From (4) and (8) we have

$$|B(N)| < \sum_{i=1}^{48} w_i\, \frac{e^{-\frac12 \gamma_i^2 N_i}}{\sqrt{2\pi}\, \gamma_i \sqrt{N_i}}. \tag{9}$$

Formula (9) is valid whenever $\pi_i \ne \tfrac12$ and shows that $B(N)$ converges rapidly
to 0 for all values $\pi_i \ne \tfrac12$.

To obtain an approximate idea of the magnitude of the bias we may in (4)
replace $\epsilon_i$ and $P_i$ by their sample values $e_i$ and $r_i$. The quantity $\sum w_i (e_i - r_i)$
can, however, not be regarded as an estimate of $B(N)$.

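We now proceed to compute the standard error of $G$. We may consider the
poll as 48 single experiments where the probability of success in the $i$th experiment
is given by $P_i$ where

$$\frac{1}{\sqrt{2\pi}} \int_{\gamma_i \sqrt{N_i}}^{\infty} e^{-\frac12 x^2}\, dx = \begin{cases} P_i & \text{if } \pi_i < \tfrac12 \\ 1 - P_i & \text{if } \pi_i > \tfrac12. \end{cases}$$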

Hence the variance of $G$ is given by

$$\sigma^2 = \sum_{i=1}^{48} P_i (1 - P_i) w_i^2. \tag{10}$$

As an estimate of $\sigma^2$ we can use the quantity $s^2$ obtained by replacing $P_i$ by
its sample value.

We shall consider that distribution of interviews as best which minimizes
$E[(G - \Gamma)^2]$.

We have

$$E[(G - \Gamma)^2] = \sigma^2 + B^2(N).$$

We therefore consider the problem of minimizing $\sigma^2 + B^2(N)$ under the restriction

$$\sum_{i=1}^{48} N_i = N.$$
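The decomposition of the mean square error into variance plus squared bias can be verified by brute force on a small example. The sketch assumes, as the text does, that the states behave as independent single experiments; the probabilities, votes and function names are hypothetical.

```python
from itertools import product

def mse_formula(ps, ws, eps):
    """sigma^2 + B^2(N), with sigma^2 from (10) and B(N) from (4)."""
    variance = sum(p * (1 - p) * w * w for p, w in zip(ps, ws))
    bias = sum((e - p) * w for p, w, e in zip(ps, ws, eps))
    return variance + bias ** 2

def mse_enumerated(ps, ws, eps):
    """E[(G - Gamma)^2] by enumerating every pattern of state outcomes."""
    gamma = sum(e * w for e, w in zip(eps, ws))
    total = 0.0
    for outcome in product([0, 1], repeat=len(ps)):
        prob = 1.0
        for won, p in zip(outcome, ps):
            prob *= p if won else 1 - p   # independent states assumed
        g = sum(won * w for won, w in zip(outcome, ws))
        total += prob * (g - gamma) ** 2
    return total

ps = [0.84, 0.66, 0.98]   # hypothetical P_i
ws = [10, 5, 20]          # hypothetical electoral votes
eps = [1, 1, 1]           # candidate A truly leads in every state
print(mse_formula(ps, ws, eps), mse_enumerated(ps, ws, eps))
```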

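We have

$$\frac{\partial \sigma^2}{\partial N_i} = \frac{\partial \sigma^2}{\partial P_i}\, \frac{\partial P_i}{\partial N_i} = w_i^2 (1 - 2P_i)\, \frac{\partial P_i}{\partial N_i},$$

$$\frac{\partial B^2(N)}{\partial N_i} = 2B(N)\, \frac{\partial B(N)}{\partial N_i} = -2 w_i B(N)\, \frac{\partial P_i}{\partial N_i},$$

$$\frac{\partial P_i}{\partial N_i} = \begin{cases} -\dfrac{1}{\sqrt{2\pi}}\, e^{-\frac12 \gamma_i^2 N_i}\, \dfrac{\gamma_i}{2\sqrt{N_i}} & \text{if } \pi_i < \tfrac12 \\[2ex] \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac12 \gamma_i^2 N_i}\, \dfrac{\gamma_i}{2\sqrt{N_i}} & \text{if } \pi_i > \tfrac12. \end{cases}$$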
Hence applying the method of Lagrange multipliers, we obtain

$$w_i \left[w_i (1 - 2P_i) - 2B(N)\right] \frac{\partial P_i}{\partial N_i} = \lambda, \quad i = 1, \ldots, 48, \qquad \sum_{i=1}^{48} N_i = N. \tag{11}$$

The parameters $\gamma_i$ and $\pi_i$ in equation (11) can be estimated from a previous
poll.³ It is not certain that (11) always has solutions. However if the quantity
$\sigma^2 + B^2(N)$ has a minimum for a set of values $N_1, \ldots, N_{48}$ with $N_i \ne 0$ ($i = 1,
\ldots, 48$) then (11) must have a solution.
One might be induced to try to estimate $\sum P_i w_i$ directly by using

$$r_i = \frac{1}{\sqrt{2\pi}\, s_i} \int_{1/2}^{\infty} e^{-(x - p_i)^2 / 2 s_i^2}\, dx$$

as an estimate of $P_i$. It is easy to see that $r_i$ is a consistent estimate of $\epsilon_i$.
It will be shown however that this estimate is more
biased than the estimate (1).

Since $\sigma_i$ differs only very little from its sample estimate $s_i$ we may replace this
sample estimate by $\sigma_i$. We then have

- 2^ C (C 

= I" dx dpi. 

ix - Pi? + (Pi - nf = + 2 (p* - 

= 2^? /f (/7 dp)jdx. 


Now 


Hence 


The second integral is equal to -s/ii^. Hence 


~x*/2 


iV2 


dx. 


³ If for any $i$, $\pi_i$ were very close to $\tfrac12$ then it would be of little use to poll the $i$th state.
Hence, in this case formula (11) gives a small value for $N_i$. However, the $\pi_i$ are never
accurately known. The following procedure might be recommended for determining the
best distribution of interviews: If for one particular $i$ the sample value of $\pi_i$ as estimated
from a previous poll is too close to ^ determine, using the Ni of the previous poll, that value 
iti of m for which the probability is that pi is larger than J and substitute in (11) 
for n . In all other cases substitute the sample value. 

If several polls are taken it is advisable to use all of them but the last one to estimate
as closely as possible the values of the $\pi_i$. The sample of the last poll before the election
should be distributed according to (11).
From (12) we see that $E(r_i) < P_i$ if $\pi_i > \tfrac12$ and $E(r_i) > P_i$ if $\pi_i < \tfrac12$.

Thus in every case this estimate is more biased than the estimate (1).

On the other hand, we shall now show that $E[(\epsilon_i - r_i)^2]$ is always smaller than
$E[(\epsilon_i - e_i)^2]$. Since $\epsilon_i = 1$ if $\pi_i > \tfrac12$ and $\epsilon_i = 0$ if $\pi_i < \tfrac12$ it is easy to verify that
$E[(\epsilon_i - r_i)^2]$ has the same value for $\pi_i = a$ as for $\pi_i = 1 - a$ and the same is true
for $E[(\epsilon_i - e_i)^2]$. We may, therefore, without loss of generality assume that
$\pi_i < \tfrac12$.

Thus we have to show that

$$E(r_i^2) < E(e_i^2) = P_i = \frac{1}{\sqrt{2\pi}\, \sigma_i} \int_{1/2}^{\infty} e^{-(x - \pi_i)^2 / 2\sigma_i^2}\, dx \quad \text{if } \pi_i < \tfrac12. \tag{13}$$

We have 

Now 

Qix, y, Pi) = (x - pif + (jj - Pif + (Pi- Tif 

x + 


(pi - 1 (;. + y - 2,,)* + I (X - »)*. 


Putting 




Pi = 


\/6 

, _L - y) 1 - 

“ \/2 ’ Vlci - 


(a: + y - 2ir<) 


<ri 


we obtain 




. ‘f, 

Now for IT,- = J we have a = 0, and for < J we have a > 0. For a = 0 we 
obviously haveE(r? < E(e\), Further lim E{r\) = lim E{e\) = 0 hence (13) 

a—»oo o-*flo 

is proved if we can show that 

F(o) = E(r]) - £(e?) = ^ e-**’ ^1/ 
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is a monotonically increasing function of o. Differentiating F(a) with respect 
to a we obtain 


dF{a) ^ _\/3 r* ^ V3 

^ Jo 2y/v 


da 


(14) 


“~V^3 ^~(3/4)a 


^ -V3 -(»/4)a* / 


•'a 

f 

*fa 


e 


4(x-(8/4)o)* 


V3 ^-(8/4)0* 


2\/ TT 

^-(8/4)0* 


dx + e* 


Hence for o > 0 we have 


^ > z:^ «-»-• + e-^‘ > 0. 

^ 2 v ^^ 2\/t 


Hence we have proved 


( 16 ) 


- n-)’] = ~ r e-*^' g-»‘ aydx< Eliti - eO*], 


a = 


Vacjai- 
1 - 27r4 


V6 




Since 


E[(ti - e.)*] - - uf] 

is largest when = § we also have 

> I- P. I - [1 - i f P-- * * ] 


or 

(16) \^i-Pi\> £[(«.- - n)*] > ^[ e-^-’ dj/ rfa: - I i - « I. 

Because of (15), $r_i$ although more biased may in many cases be preferable
to $e_i$ as an estimate of $\epsilon_i$.



NOTES 

This section is devoted to brief research and expository articles, notes on method¬ 
ology and other short items. 


A COMBINATORIAL FORMULA AND ITS APPLICATION TO THE 
THEORY OF PROBABILITY OF ARBITRARY EVENTS^ 

By Kai-Lai Chung and Lietz C. Hsu 
National Southwest Associated University, Kunming, China 

An important principle, known as a proposition in formal logic or the method
of cross-classification, can be stated as follows.¹

Let $F$ and $f$ be any two functions of combinations out of $(\nu) = (1, 2, \ldots, n)$.
Then the two formulas

$$F((\alpha)) = \sum_{(\beta) \in (\nu) - (\alpha)} f((\alpha) + (\beta)) \tag{1.1}$$

$$f((\alpha)) = \sum_{(\beta) \in (\nu) - (\alpha)} (-1)^{b} F((\alpha) + (\beta)) \tag{2.1}$$

are equivalent.
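The equivalence of (1.1) and (2.1) can be tried out on a small ground set. In the sketch below combinations are represented as Python frozensets, $b$ is taken to be the number of elements of $(\beta)$, and the function $f$ is an arbitrary made-up table; none of these representation choices come from the paper.

```python
from itertools import combinations

def subsets(elements):
    """All combinations (as frozensets) that can be formed from the given elements."""
    elements = sorted(elements)
    for k in range(len(elements) + 1):
        for combo in combinations(elements, k):
            yield frozenset(combo)

def forward(f, universe):
    """Formula (1.1): F(a) is the sum of f(a + b) over all b out of universe - a."""
    return {a: sum(f[a | b] for b in subsets(universe - a))
            for a in subsets(universe)}

def inverse(F, universe):
    """Formula (2.1): f(a) is the alternating sum of F(a + b) over the same b."""
    return {a: sum((-1) ** len(b) * F[a | b] for b in subsets(universe - a))
            for a in subsets(universe)}

universe = frozenset({1, 2, 3})
f = {a: 3 * sum(a) + len(a) ** 2 + 1 for a in subsets(universe)}  # arbitrary values
F = forward(f, universe)
print(inverse(F, universe) == f)  # True
```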

As an immediate application to the theory of probability of arbitrary events,
we have the set of inversion formulas²

$$p((\alpha)) = \sum_{(\beta) \in (\nu) - (\alpha)} p[(\alpha) + (\beta)] \tag{3.1}$$

$$p[(\alpha)] = \sum_{(\beta) \in (\nu) - (\alpha)} (-1)^{b}\, p((\alpha) + (\beta)) \tag{4.1}$$

where $p((\alpha))$ is the probability of the occurrence of at least $E_{\alpha_1}, \ldots, E_{\alpha_a}$
out of $n$ arbitrary events $E_1, E_2, \ldots, E_n$ and $p[(\alpha)]$ is the probability of the
occurrence of $E_{\alpha_1}, E_{\alpha_2}, \ldots, E_{\alpha_a}$ and no others among the $n$ events, $(\alpha_1, \alpha_2,
\ldots, \alpha_a)$ denoting a combination of the integers $(1, 2, \ldots, n)$. They can be
made to play a central rôle in the theory, since they supply a method for converting
the fundamental systems of probabilities, $p[(\alpha)]$ and $p((\alpha))$, one into the
other.

We may further generalize (1.1) and (2.1) by considering combinations with
repetitions. Let such a combination be written as

$$(\alpha) = (\alpha^{r}) = (\alpha_1^{r_1} \alpha_2^{r_2} \cdots \alpha_a^{r_a})$$

¹ For the notations and definitions see K. L. Chung, "On fundamental systems of probabilities
of a finite number of events," Annals of Math. Stat., Vol. 14 (1943), pp. 123-133.

² Cf. Fréchet, Les probabilités associées à un système d'événements compatibles et dépendants,
Hermann, Paris (1939), formulas (55) and (58).
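where $r_i$ ($r_i \ge 1$) denotes the number of repetitions of the number $\alpha_i$, $i =
1, 2, \ldots, a$. Correspondingly we write

$$(\alpha)' = (\alpha_1 \alpha_2 \cdots \alpha_a)$$

and call it the reduced combination corresponding to $(\alpha)$.

If there are $n$ distinct elements $(1, 2, \ldots, n)$ in question, we may write every
combination in the form

$$(1^{r_1} 2^{r_2} \cdots n^{r_n})$$

where each $r_i$ is zero or a positive integer. We say that $(1^{s_1} 2^{s_2} \cdots n^{s_n})$ belongs
to $(1^{r_1} 2^{r_2} \cdots n^{r_n})$ and write

$$(1^{s_1} 2^{s_2} \cdots n^{s_n}) \in (1^{r_1} 2^{r_2} \cdots n^{r_n})$$

if and only if for each $i$, $i = 1, 2, \ldots, n$, we have $s_i \le r_i$. We write

$$(1^{r_1} 2^{r_2} \cdots n^{r_n}) + (1^{s_1} 2^{s_2} \cdots n^{s_n}) = (1^{r_1 + s_1} 2^{r_2 + s_2} \cdots n^{r_n + s_n})$$

and if $(1^{s_1} 2^{s_2} \cdots n^{s_n}) \in (1^{r_1} 2^{r_2} \cdots n^{r_n})$,

$$(1^{r_1} 2^{r_2} \cdots n^{r_n}) - (1^{s_1} 2^{s_2} \cdots n^{s_n}) = (1^{r_1 - s_1} 2^{r_2 - s_2} \cdots n^{r_n - s_n}).$$

We define a generalized Möbius function $\mu((\alpha))$ for combinations (with or without
repetitions) as follows

$$\mu((\alpha)) = \begin{cases} (-1)^{a} & \text{if } (\alpha) = (\alpha)' \\ 0 & \text{if } (\alpha) \ne (\alpha)'. \end{cases}$$

This function has the property

$$\sum_{(\beta) \in (\alpha)} \mu((\beta)) = \begin{cases} 1 & \text{if } (\alpha) = (0) \\ 0 & \text{if } (\alpha) \ne (0). \end{cases}$$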


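For we have

$$\sum_{(\beta) \in (\alpha)} \mu((\beta)) = \sum_{(\beta) \in (\alpha)'} (-1)^{b} = \sum_{b=0}^{a} \binom{a}{b} (-1)^{b} = \begin{cases} 1 & \text{if } a = 0 \\ 0 & \text{if } a \ne 0 \end{cases} = \begin{cases} 1 & \text{if } (\alpha) = (0) \\ 0 & \text{if } (\alpha) \ne (0). \end{cases}$$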
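The summation property of the generalized Möbius function is easy to confirm by enumeration. In this sketch a combination with repetitions is represented, as an assumption of the example, by its tuple of exponents $(r_1, \ldots, r_n)$; the function names are hypothetical.

```python
from itertools import product

def mobius(exponents):
    """(-1)^a when no element is repeated (a distinct elements), otherwise 0."""
    if any(e > 1 for e in exponents):
        return 0
    return (-1) ** sum(exponents)

def sub_combinations(exponents):
    """Every combination (s) belonging to (r), that is with s_i <= r_i for each i."""
    return product(*(range(e + 1) for e in exponents))

def mobius_sum(exponents):
    """Sum of the Möbius function over all combinations belonging to (r)."""
    return sum(mobius(s) for s in sub_combinations(exponents))

print(mobius_sum((0, 0, 0)))  # 1
print(mobius_sum((2, 0, 1)))  # 0
print(mobius_sum((3, 2, 4)))  # 0
```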

Now we state and prove the following general theorem. 

Theorem. Let (a),- *= (v),- = 

where \ij and Ui are finite and 1 < r,y < , 1 < a,* < n,- /or z = 1, 2, • • • , m 

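and $j = 1, 2, \ldots, n_i$. Then for any two functions $F$ and $f$ of the $m$ combinations (with
repetitions), $(\alpha)_1, (\alpha)_2, \ldots, (\alpha)_m$ out of $(\nu)_1, (\nu)_2, \ldots, (\nu)_m$, the two sets of
formulas:

$$F((\alpha)_1, (\alpha)_2, \ldots, (\alpha)_m) = \sum_{(\beta)_i \in (\nu)_i - (\alpha)_i} f((\alpha)_1 + (\beta)_1, (\alpha)_2 + (\beta)_2, \ldots, (\alpha)_m + (\beta)_m) \tag{1}$$

and

$$f((\alpha)_1, (\alpha)_2, \ldots, (\alpha)_m) = \sum_{(\beta)_i \in (\nu)_i - (\alpha)_i} \left[\prod_{i=1}^{m} \mu((\beta)_i)\right] F((\alpha)_1 + (\beta)_1, (\alpha)_2 + (\beta)_2, \ldots, (\alpha)_m + (\beta)_m) \tag{2}$$

are equivalent.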

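Proof. To deduce (2) from (1), we write

$$\sum_{(\beta)_i \in (\nu)_i - (\alpha)_i} \left[\prod_{i=1}^{m} \mu((\beta)_i)\right] F((\alpha)_1 + (\beta)_1, \ldots, (\alpha)_m + (\beta)_m)$$

$$= \sum_{(\beta)_i \in (\nu)_i - (\alpha)_i} \left[\prod_{i=1}^{m} \mu((\beta)_i)\right] \sum_{(\gamma)_i \in (\nu)_i - (\alpha)_i - (\beta)_i} f((\alpha)_1 + (\beta)_1 + (\gamma)_1, \ldots, (\alpha)_m + (\beta)_m + (\gamma)_m)$$

$$= \sum_{(\delta)_i \in (\nu)_i - (\alpha)_i} f((\alpha)_1 + (\delta)_1, \ldots, (\alpha)_m + (\delta)_m) \cdot \sum_{(\gamma)_i \in (\delta)_i} \prod_{i=1}^{m} \mu((\delta)_i - (\gamma)_i).$$

Evidently we have

$$\sum_{(\gamma)_i \in (\delta)_i} \prod_{i=1}^{m} \mu((\delta)_i - (\gamma)_i) = \prod_{i=1}^{m} \left[\sum_{(\gamma)_i \in (\delta)_i} \mu((\delta)_i - (\gamma)_i)\right] = \begin{cases} 1 & \text{if } (\delta)_i = (0) \text{ for } i = 1, \ldots, m \\ 0 & \text{otherwise} \end{cases}$$

by the property of the $\mu$-function. Hence the preceding sum reduces to
$f((\alpha)_1, \ldots, (\alpha)_m)$ in accord with (2).

(1) is deduced from (2) in a similar way.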
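For the special case $m = 1$ the theorem can be checked directly. The sketch below again stores a combination with repetitions as its tuple of exponents (an assumption of the example), builds $F$ from an arbitrary made-up $f$ by formula (1), and recovers $f$ by formula (2).

```python
from itertools import product

def mobius(s):
    """Generalized Möbius function on exponent tuples."""
    return 0 if any(e > 1 for e in s) else (-1) ** sum(s)

def below(bound):
    """All exponent tuples (s) with 0 <= s_i <= bound_i."""
    return product(*(range(b + 1) for b in bound))

def add(r, s):
    """Sum of two combinations: exponents add element by element."""
    return tuple(a + b for a, b in zip(r, s))

def remaining(lam, r):
    """The combination (nu) - (r), where (nu) has exponents lam."""
    return tuple(l - a for l, a in zip(lam, r))

def forward(f, lam):
    """Formula (1) with m = 1."""
    return {r: sum(f[add(r, s)] for s in below(remaining(lam, r))) for r in below(lam)}

def inverse(F, lam):
    """Formula (2) with m = 1."""
    return {r: sum(mobius(s) * F[add(r, s)] for s in below(remaining(lam, r)))
            for r in below(lam)}

lam = (2, 1, 3)  # maximum exponents, chosen for illustration
f = {r: 1 + sum((i + 2) * e * e for i, e in enumerate(r)) for r in below(lam)}
print(inverse(forward(f, lam), lam) == f)  # True
```

Read probabilistically, $f$ plays the part of the "exactly $r_i$ times" probabilities and $F$ that of the "at least $r_i$ times" probabilities.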

Although the general case is not without importance in the treatment of
several sets of events,³ we shall for the sake of convenience restrict ourselves to
the special case $m = 1$.

In order to apply these formulas we must first introduce combinations with
repetitions into the theory of arbitrary events. This can be done in various
ways. Firstly, we may consider the number of occurrences of each event in a
given time-interval or in a series of trials not necessarily independent. Secondly,
we may regard each event as possessing various degrees of intensity. If the
event $E_i$ occurs $r_i$ times in a given time-interval or occurs with $r_i$ degrees of
intensity, we write it as $E_i^{r_i}$. Hereafter we shall make use of the first interpretation
and we shall assume that the maximum number of occurrences of each event
is finite:

$$0 \le r_i \le \lambda_i, \quad i = 1, \ldots, n.$$

³ Cf. Fréchet, loc. cit., pp. 50-52; also, K. L. Chung, "Generalization of Poincaré's
formula in the theory of probability," Annals of Math. Stat., Vol. 14 (1943). We may note
that our general theorem may be used to give another proof of the generalized Poincaré's
formula for several sets of events.

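We define, writing $(\nu^r) = (1^{r_1} 2^{r_2} \cdots n^{r_n})$ and $(\nu^\lambda) = (1^{\lambda_1} 2^{\lambda_2} \cdots n^{\lambda_n})$,

$p[E_1^{r_1} \cdots E_n^{r_n}] = p[(\nu^r)]$ = The probability that $E_i$ occurs exactly $r_i$
times in the given time-interval.

$p(E_1^{r_1} \cdots E_n^{r_n}) = p((\nu^r))$ = The probability that $E_i$ occurs at least $r_i$
times in the given time-interval.

These quantities play the same rôle as the $p[(\alpha)]$'s and $p((\alpha))$'s in the ordinary
theory. Evidently the probability of every complex event in question can be
expressed as the sum of certain $p[(\nu^r)]$'s. To prove that the $p((\nu^r))$'s also form
a fundamental system of quantities we have only to express the $p[(\nu^r)]$'s in terms of
the $p((\nu^r))$'s. This is given immediately by an application of the general
theorem with $m = 1$. For we have in an obvious way

$$p(E_1^{r_1} \cdots E_n^{r_n}) = \sum_{r_i \le s_i \le \lambda_i} p[E_1^{s_1} \cdots E_n^{s_n}]$$

or

(3) $$p((\nu^r)) = \sum_{(\sigma) \subset (\nu^\lambda) - (\nu^r)} p[(\nu^r) + (\sigma)].$$

Hence we obtain the inversion

(4) $$p[(\nu^r)] = \sum_{(\sigma) \subset (\nu^\lambda) - (\nu^r)} \mu((\sigma))\, p((\nu^r) + (\sigma)).$$

Let $(\alpha')$ denote a running combination without repetitions. Then since $\mu((\sigma)) = 0$
unless $(\sigma)$ is an $(\alpha')$,

(4') $$p[(\nu^r)] = \sum_{(\alpha') \subset (\nu^\lambda) - (\nu^r)} \mu((\alpha'))\, p((\nu^r) + (\alpha')) = \sum_{(\alpha') \subset (\nu^\lambda) - (\nu^r)} (-1)^{a}\, p((\nu^r) + (\alpha')).$$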


The set of formulas (3) and (4) generalize (3.1) and (4.1). 
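The passage from the "at least" probabilities back to the "exactly" probabilities can be illustrated numerically. The sketch below is not part of the original argument: the bounds `LAM` and the toy distribution are hypothetical, combinations are again represented by tuples of multiplicities, and exact rational arithmetic is used so that the inversion can be compared term by term.

```python
from fractions import Fraction
from itertools import product

LAM = (2, 1, 3)  # hypothetical maximum numbers of occurrences lambda_i


def mobius(sigma):
    """(-1)**(number of elements) if sigma has no repeated element, else 0."""
    if any(s > 1 for s in sigma):
        return 0
    return (-1) ** sum(sigma)


def at_least(p_exact, r):
    """Formula (3): sum the exact probabilities over all s with s_i >= r_i."""
    return sum(p for s, p in p_exact.items()
               if all(si >= ri for si, ri in zip(s, r)))


def exactly(p_at_least, r, lam):
    """Formula (4): Moebius inversion over (sigma) contained in (lam) - (r)."""
    total = Fraction(0)
    for sigma in product(*(range(l - ri + 1) for l, ri in zip(lam, r))):
        mu = mobius(sigma)
        if mu:
            shifted = tuple(ri + si for ri, si in zip(r, sigma))
            total += mu * p_at_least[shifted]
    return total


# Toy exact distribution with weights proportional to 1 + r_1 + 2 r_2 + 3 r_3.
states = list(product(*(range(l + 1) for l in LAM)))
weights = {s: 1 + s[0] + 2 * s[1] + 3 * s[2] for s in states}
norm = sum(weights.values())
p_exact = {s: Fraction(w, norm) for s, w in weights.items()}

p_at_least = {r: at_least(p_exact, r) for r in states}
recovered = {r: exactly(p_at_least, r, LAM) for r in states}
print(recovered == p_exact)
```

Because $\mu$ vanishes on combinations with a repeated element, the inner loop effectively runs only over the combinations without repetitions, as in (4').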

Corresponding to the $p_{[m]}((\nu))$ for the ordinary events we define for $a + b + \cdots = n$
and $r, s, \cdots$ all distinct:

$p_{[a]^r [b]^s \cdots}(E_1^{\lambda_1} \cdots E_n^{\lambda_n})$ = The probability that among $n$ events $E_1, E_2, \cdots, E_n$
exactly $a$ events occur $r$ times, exactly $b$ events occur $s$ times and so on.
By (4) we easily obtain

(5) $$p_{[a]^r [b]^s \cdots}((\nu^\lambda)) = \sum \; \sum_{(\sigma) \subset (\nu^\lambda) - ((\alpha)^r + (\beta)^s + \cdots)} \mu((\sigma))\, p((\sigma) + (\alpha)^r + (\beta)^s + \cdots)$$

where $(\alpha)^r = (\alpha_1^r \cdots \alpha_a^r)$, $(\beta)^s = (\beta_1^s \cdots \beta_b^s)$, $\cdots$ and the first summation
is a symmetric sum which extends to all $n!/a!\,b! \cdots$ different combinations
$(\alpha_1 \cdots \alpha_a)$, $(\beta_1 \cdots \beta_b)$, $\cdots$ out of $(\nu) = (1, 2, \cdots, n)$.

The equality (5) is obviously a generalization of Poincaré's formula.
Similar formulas hold for the probabilities in whose definition the word “exactly”
is sometimes replaced by the words “at least.” Of course we can express
all of them in terms of the $p[(\nu^r)]$'s or of the $p((\nu^r))$'s. However, elegant formulas
such as in the ordinary theory seem to be lacking.

Finally, we may also consider conditions of existence for the $p[(\nu^r)]$'s and the
$p((\nu^r))$'s. For the former system the conditions are that they be all non-negative
and that their sum be 1. For the latter system, the conditions are given by
(4'), viz. for every $(\nu^r) \subset (\nu^\lambda)$,

$$\sum_{(\alpha') \subset (\nu^\lambda) - (\nu^r)} \mu((\alpha'))\, p((\nu^r) + (\alpha')) \ge 0.$$

These conditions are necessary and sufficient since (3) and (4) are equivalent. 


ON THE MECHANICS OF CLASSIFICATION 

By Carl F. Kossack 
University of Oregon 

1. Introduction. Wald¹ has recently determined the distribution of the
statistic $U$ to be used in the classification of an observation, $z_i$ $(i = 1, 2, \cdots, p)$,
as coming from one of two populations. He also determined the critical region 
which is most powerful for such a classification. It is the purpose of this paper 
to show how such a classification statistic under the assumption of large sampling 
can be applied in an actual problem and to present a systematic approach to the 
necessary computations. 

The data used in this demonstration are those which were obtained from the
A.S.T.P. pre-engineering trainees assigned to the University of Oregon. The
problem considered is that of classifying a trainee as to whether he will do
unsatisfactory or satisfactory work² in the first term mathematics course
(Intermediate Algebra). The variables used in the classification are: (1) A
Mathematics Placement Test Score. This is the score obtained by the trainee on a
fifty-minute elementary mathematics test (including elementary algebra).
The test was given to each trainee on the day that he arrived on the campus.
(2) A High School Mathematics Score. A trainee's high school mathematics
record was made into a score by giving 1 point to students who had had no high
school algebra, 2 points to students with an F in first-year, high-school algebra
and no second-year algebra, 3 points for a D, $\cdots$, 10 points for an average grade
of A in first- and second-year algebra. (3) The Army General Classification
Test Score. An individual needed a score of 115 or better in order to be assigned
to the A.S.T.P. These data were obtained for 305 trainees along with the actual
grade made by them in the algebra course. Trainees who had had college work
were not included in the study.

¹ Abraham Wald, “On a statistical problem arising in the classification of an individual
into one of two groups,” Annals of Math. Stat., Vol. 15 (1944), No. 2.

² Unsatisfactory work was defined as a grade of F or D in the course (failure or the lowest
passing grade).

2. Steps in the Computation of U and the Critical Region. Let

$\pi_1$ be the population of individuals who do unsatisfactory work in their
first-term mathematics course.

$\pi_2$ be the population of individuals who do satisfactory work.

$N_1$ and $N_2$ = respectively the number of observed individuals in $\pi_1$ and $\pi_2$.

$x_{1\alpha}$ and $y_{1\alpha}$ = respectively the Mathematics Placement Test Score for the
$\alpha$th individual observed in $\pi_1$ and $\pi_2$.

$x_{2\alpha}$ and $y_{2\alpha}$ = respectively the High School Mathematics Score.

$x_{3\alpha}$ and $y_{3\alpha}$ = respectively the Army General Classification Test Score.

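Step 1. Computation of Summations

$N_1 = 96$                                      $N_2 = 209$

$\sum x_{1\alpha} = 3570$                       $\sum y_{1\alpha} = 11450$
$\sum x_{2\alpha} = 547$                        $\sum y_{2\alpha} = 1567$
$\sum x_{3\alpha} = 11745$                      $\sum y_{3\alpha} = 26684$

$\sum x_{1\alpha}^2 = 145476$                   $\sum y_{1\alpha}^2 = 672452$
$\sum x_{2\alpha}^2 = 3509$                     $\sum y_{2\alpha}^2 = 12577$
$\sum x_{3\alpha}^2 = 1439559$                  $\sum y_{3\alpha}^2 = 3421996$

$\sum x_{1\alpha} x_{2\alpha} = 21012$          $\sum y_{1\alpha} y_{2\alpha} = 88774$
$\sum x_{1\alpha} x_{3\alpha} = 436964$         $\sum y_{1\alpha} y_{3\alpha} = 1469302$
$\sum x_{2\alpha} x_{3\alpha} = 66731$          $\sum y_{2\alpha} y_{3\alpha} = 200150$

$\sum (x_{1\alpha} - \bar{x}_1)^2 = 12716.625$  $\sum (y_{1\alpha} - \bar{y}_1)^2 = 45167.311$
$\sum (x_{2\alpha} - \bar{x}_2)^2 = 392.240$    $\sum (y_{2\alpha} - \bar{y}_2)^2 = 828.249$
$\sum (x_{3\alpha} - \bar{x}_3)^2 = 2631.656$   $\sum (y_{3\alpha} - \bar{y}_3)^2 = 15125.876$

$\sum (x_{1\alpha} - \bar{x}_1)(x_{2\alpha} - \bar{x}_2) = 670.438$    $\sum (y_{1\alpha} - \bar{y}_1)(y_{2\alpha} - \bar{y}_2) = 2926.392$
$\sum (x_{1\alpha} - \bar{x}_1)(x_{3\alpha} - \bar{x}_3) = 196.812$    $\sum (y_{1\alpha} - \bar{y}_1)(y_{3\alpha} - \bar{y}_3) = 7427.359$
$\sum (x_{2\alpha} - \bar{x}_2)(x_{3\alpha} - \bar{x}_3) = -191.031$   $\sum (y_{2\alpha} - \bar{y}_2)(y_{3\alpha} - \bar{y}_3) = 83.837$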
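The sums about the means in Step 1 follow from the raw sums by the usual identity $\sum (u - \bar{u})(v - \bar{v}) = \sum uv - \sum u \sum v / N$. A short Python sketch of that arithmetic, using the raw sums printed above (the variable and function names are illustrative, not the author's):

```python
# Raw sums from Step 1; the keys 1, 2, 3 index the three classification variables.
N1, N2 = 96, 209
x_sum = {1: 3570, 2: 547, 3: 11745}
y_sum = {1: 11450, 2: 1567, 3: 26684}
x_prod = {(1, 1): 145476, (2, 2): 3509, (3, 3): 1439559,
          (1, 2): 21012, (1, 3): 436964, (2, 3): 66731}
y_prod = {(1, 1): 672452, (2, 2): 12577, (3, 3): 3421996,
          (1, 2): 88774, (1, 3): 1469302, (2, 3): 200150}


def corrected(prod, total, n):
    """Sums of squares and products about the means:
    sum(u*v) - sum(u)*sum(v)/n for every index pair (i, j)."""
    return {(i, j): s - total[i] * total[j] / n for (i, j), s in prod.items()}


x_corr = corrected(x_prod, x_sum, N1)
y_corr = corrected(y_prod, y_sum, N2)

for key in sorted(x_corr):
    print(key, round(x_corr[key], 3), round(y_corr[key], 3))
```

Keeping the raw sums and the corrected sums in separate tables makes it easy to spot a copying error in either.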

Step 2. Computation of Statistics.

$\bar{x}_1 = 37.188$        $\bar{y}_1 = 54.785$
$\bar{x}_2 = 5.6979$        $\bar{y}_2 = 7.4976$
$\bar{x}_3 = 122.3438$      $\bar{y}_3 = 127.6746$

$$s_{ij} = \frac{\sum (x_{i\alpha} - \bar{x}_i)(x_{j\alpha} - \bar{x}_j) + \sum (y_{i\alpha} - \bar{y}_i)(y_{j\alpha} - \bar{y}_j)}{N_1 + N_2 - 2}$$

$s_{11} = 191.04$       $s_{12} = 11.871$
$s_{22} = 4.0280$       $s_{13} = 25.162$
$s_{33} = 58.606$       $s_{23} = -.35378$

Step 3. Computation of Inverse Matrix $\| s^{ij} \|$

$$| s_{ij} | = \begin{vmatrix} 191.04 & 11.871 & 25.162 \\ 11.871 & 4.0280 & -.35378 \\ 25.162 & -.35378 & 58.606 \end{vmatrix} = 34053$$

$s^{11} = .0069286$     $s^{12} = -.020692$
$s^{22} = .31019$       $s^{13} = -.0030996$
$s^{33} = .018459$      $s^{23} = .010756$
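Steps 2 and 3 can be reproduced in a few lines. The sketch below is illustrative only: it recomputes the pooled $s_{ij}$ from the corrected sums of Step 1, and then inverts the matrix of printed (rounded) values by cofactors; the helper names `det3` and `inverse3` are hypothetical.

```python
N1, N2 = 96, 209

# Corrected sums of squares and products from Step 1.
x_corr = {(1, 1): 12716.625, (2, 2): 392.240, (3, 3): 2631.656,
          (1, 2): 670.438, (1, 3): 196.812, (2, 3): -191.031}
y_corr = {(1, 1): 45167.311, (2, 2): 828.249, (3, 3): 15125.876,
          (1, 2): 2926.392, (1, 3): 7427.359, (2, 3): 83.837}

# Step 2: pooled variances and covariances on N1 + N2 - 2 degrees of freedom.
s = {ij: (x_corr[ij] + y_corr[ij]) / (N1 + N2 - 2) for ij in x_corr}

# Step 3: the matrix as printed (rounded to five significant figures).
S = [[191.04, 11.871, 25.162],
     [11.871, 4.0280, -0.35378],
     [25.162, -0.35378, 58.606]]


def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def inverse3(m):
    """Inverse of a 3 x 3 matrix: transposed cofactors divided by the determinant."""
    d = det3(m)
    inv = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            rows = [r for r in range(3) if r != j]
            cols = [c for c in range(3) if c != i]
            minor = (m[rows[0]][cols[0]] * m[rows[1]][cols[1]]
                     - m[rows[0]][cols[1]] * m[rows[1]][cols[0]])
            inv[i][j] = (-1) ** (i + j) * minor / d
    return inv


S_inv = inverse3(S)
print(round(det3(S)))
for row in S_inv:
    print([float(f"{v:.5g}") for v in row])
```

For larger numbers of variables a general elimination routine would replace the explicit cofactor expansion; with $p = 3$ the cofactor form mirrors a hand computation.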

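Step 4. Computation of the Classification Equation.

$$U = [s^{11}(\bar{y}_1 - \bar{x}_1) + s^{12}(\bar{y}_2 - \bar{x}_2) + s^{13}(\bar{y}_3 - \bar{x}_3)] \cdot z_1$$
$$\quad + [s^{21}(\bar{y}_1 - \bar{x}_1) + s^{22}(\bar{y}_2 - \bar{x}_2) + s^{23}(\bar{y}_3 - \bar{x}_3)] \cdot z_2$$
$$\quad + [s^{31}(\bar{y}_1 - \bar{x}_1) + s^{32}(\bar{y}_2 - \bar{x}_2) + s^{33}(\bar{y}_3 - \bar{x}_3)] \cdot z_3$$

where $z_i$ plays the same role for individuals to be classified as $x_{i\alpha}$ and $y_{i\alpha}$ do for
observed individuals.

$$U = .068160\, z_1 + .25147\, z_2 + .063215\, z_3$$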
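The coefficients of $z_1, z_2, z_3$ are the products of the rows of the inverse matrix with the vector of mean differences $\bar{y}_j - \bar{x}_j$. A minimal Python sketch of this step, using the printed values of Steps 2 and 3 (the names are illustrative):

```python
# Means from Step 2 and inverse matrix from Step 3, as printed.
x_bar = [37.188, 5.6979, 122.3438]
y_bar = [54.785, 7.4976, 127.6746]
s_inv = [[0.0069286, -0.020692, -0.0030996],
         [-0.020692, 0.31019, 0.010756],
         [-0.0030996, 0.010756, 0.018459]]

# Differences of means between the two populations.
diff = [y - x for x, y in zip(x_bar, y_bar)]

# Coefficient of z_i is the sum over j of s^{ij} (y_bar_j - x_bar_j).
coef = [sum(s_inv[i][j] * diff[j] for j in range(3)) for i in range(3)]


def U(z):
    """Classification statistic for an individual with scores z = (z_1, z_2, z_3)."""
    return sum(c * zi for c, zi in zip(coef, z))


print([float(f"{c:.5g}") for c in coef])
```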
Step 5. Computation of the Critical Region (assuming $W_1 = W_2$).

$\alpha_1 = .068160\, \bar{x}_1 + .25147\, \bar{x}_2 + .063215\, \bar{x}_3 = 11.702$

$\alpha_2 = .068160\, \bar{y}_1 + .25147\, \bar{y}_2 + .063215\, \bar{y}_3 = 13.691$

$\tfrac{1}{2}(\alpha_1 + \alpha_2) = 12.696$

Therefore,

For $U < 12.696$ classify the individual as coming from the $\pi_1$ population.

For $U > 12.696$ classify the individual as coming from the $\pi_2$ population.
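The critical value is the midpoint of the values of $U$ at the two sets of sample means. The Python sketch below recomputes it and applies the rule to two trainees; the two score vectors are hypothetical and serve only to show the rule in use.

```python
coef = [0.068160, 0.25147, 0.063215]       # classification equation from Step 4
x_bar = [37.188, 5.6979, 122.3438]         # means, unsatisfactory group
y_bar = [54.785, 7.4976, 127.6746]         # means, satisfactory group


def U(z):
    """Classification statistic for scores z = (z_1, z_2, z_3)."""
    return sum(c * zi for c, zi in zip(coef, z))


alpha_1 = U(x_bar)
alpha_2 = U(y_bar)
cutoff = (alpha_1 + alpha_2) / 2


def classify(z):
    """Return 'pi_1' (unsatisfactory) if U is below the cutoff, else 'pi_2'."""
    return "pi_1" if U(z) < cutoff else "pi_2"


print(round(alpha_1, 3), round(alpha_2, 3), round(cutoff, 3))

# Hypothetical trainees: (placement score, high school score, AGCT score).
print(classify((40, 6, 120)))
print(classify((60, 9, 130)))
```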

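Step 6. Computation of the Efficiency of Classification.

$$a = s^{11}(\bar{y}_1 - \bar{x}_1)(\bar{y}_1 - \bar{x}_1) + s^{12}(\bar{y}_1 - \bar{x}_1)(\bar{y}_2 - \bar{x}_2) + s^{13}(\bar{y}_1 - \bar{x}_1)(\bar{y}_3 - \bar{x}_3)$$
$$\quad + s^{21}(\bar{y}_2 - \bar{x}_2)(\bar{y}_1 - \bar{x}_1) + s^{22}(\bar{y}_2 - \bar{x}_2)(\bar{y}_2 - \bar{x}_2) + s^{23}(\bar{y}_2 - \bar{x}_2)(\bar{y}_3 - \bar{x}_3)$$
$$\quad + s^{31}(\bar{y}_3 - \bar{x}_3)(\bar{y}_1 - \bar{x}_1) + s^{32}(\bar{y}_3 - \bar{x}_3)(\bar{y}_2 - \bar{x}_2) + s^{33}(\bar{y}_3 - \bar{x}_3)(\bar{y}_3 - \bar{x}_3)$$

$$= 1.5764.$$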



where $P_1$ is the probability of making an error of Type I, that is, of classifying
an individual as one who will do satisfactory work when he actually does
unsatisfactory work; and $1 - P_2$ is the probability of making an error of Type II,
that is, of classifying a student as one who will do unsatisfactory work when he
actually does satisfactory work.

3. Conclusions. In using the above classification equation to classify the 
305 trainees used in this study, 21 errors of Type I were made or 22.9 percent, 
while 50 errors of Type II were made or 23.9 percent. These percentages seem
reasonably close to the expected 20.6 percent. 


NOTE ON AN IDENTITY IN THE INCOMPLETE BETA FUNCTION 


By T. A. Bancroft
Iowa State College


Since the incomplete beta function has proved of some importance in statistics, 
it would appear that any additional information concerning its properties might 
at some time prove useful. In a paper by the author, [1], two identities in the 
incomplete beta function were incidentally obtained. They are as follows:

(1) $$(p + q)\, I_x(p, q) = p\, I_x(p + 1, q) + q\, I_x(p, q + 1)$$

and

(2) $$(p + q + 1)^{(2)} I_x(p, q) = (p + 1)^{(2)} I_x(p + 2, q) + 2pq\, I_x(p + 1, q + 1) + (q + 1)^{(2)} I_x(p, q + 2),$$

where the incomplete beta function $I_x(p, q) = \dfrac{B_x(p, q)}{B(p, q)}$, etc., and $(p + 1)^{(2)} = (p + 1)p$,
etc. refer to the standard factorial notation.

Written in the above form these two identities suggest a possible general 
identity to which they belong as special cases. The third special case suggested is: 


(3) $$(p + q + 2)^{(3)} I_x(p, q) = (p + 2)^{(3)} I_x(p + 3, q) + 3(p + 1)^{(2)} q\, I_x(p + 2, q + 1) + 3p(q + 1)^{(2)} I_x(p + 1, q + 2) + (q + 2)^{(3)} I_x(p, q + 3).$$


The general formula suggested is

(4) $$(p + q + n - 1)^{(n)} I_x(p, q) = \sum_{r=0}^{n} \binom{n}{r} (p + n - r - 1)^{(n-r)} (q + r - 1)^{(r)} I_x(p + n - r, q + r).$$


To prove the general formula we write (4) as

(5) $$(p + q + n - 1)^{(n)} I_x(p, q) = \sum_{r=0}^{n} \binom{n}{r} (p + n - r - 1)^{(n-r)} (q + r - 1)^{(r)} \frac{B_x(p + n - r, q + r)}{B(p + n - r, q + r)}.$$

By expanding and simplifying it is easy to show that

(6) $$\frac{(p + n - r - 1)^{(n-r)} (q + r - 1)^{(r)}}{B(p + n - r, q + r)} = \frac{(p + q + n - 1)^{(n)}}{B(p, q)}.$$

Using (6) the right hand side of (5) reduces to

(7) $$\frac{(p + q + n - 1)^{(n)}}{B(p, q)} \sum_{r=0}^{n} \binom{n}{r} B_x(p + n - r, q + r).$$

The summed function in (7) reduces to

(8) $$\int_0^x x^{p-1} (1 - x)^{q-1} [x + (1 - x)]^{n}\, dx = B_x(p, q),$$

which proves the identity.
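The general identity (4) can also be checked numerically. The Python sketch below is an illustration rather than part of the proof: it evaluates $I_x(p, q)$ by composite Simpson integration (adequate when $p \ge 1$ and $q \ge 1$, which is assumed here), and the function names are hypothetical.

```python
from math import comb, gamma


def falling(x, k):
    """Factorial power x^(k) = x (x - 1) ... (x - k + 1), with x^(0) = 1."""
    result = 1.0
    for j in range(k):
        result *= x - j
    return result


def beta(p, q):
    """Complete beta function B(p, q)."""
    return gamma(p) * gamma(q) / gamma(p + q)


def incomplete_beta_ratio(x, p, q, steps=2000):
    """I_x(p, q) = B_x(p, q) / B(p, q) by Simpson's rule; assumes p, q >= 1."""
    h = x / steps

    def f(t):
        return t ** (p - 1) * (1 - t) ** (q - 1)

    total = f(0.0) + f(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3 / beta(p, q)


def lhs(x, p, q, n):
    """Left-hand side of identity (4)."""
    return falling(p + q + n - 1, n) * incomplete_beta_ratio(x, p, q)


def rhs(x, p, q, n):
    """Right-hand side of identity (4)."""
    return sum(comb(n, r)
               * falling(p + n - r - 1, n - r)
               * falling(q + r - 1, r)
               * incomplete_beta_ratio(x, p + n - r, q + r)
               for r in range(n + 1))


for x, p, q, n in [(0.7, 1, 1, 1), (0.4, 2.5, 1.5, 3), (0.3, 2, 3, 4)]:
    print(x, p, q, n, lhs(x, p, q, n), rhs(x, p, q, n))
```

Setting $n = 1, 2, 3$ reproduces the special cases (1), (2) and (3).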

Although the general identity is quite simple to prove, it does not seem to 
have appeared in the literature. 


REFERENCE 

[1] Bancroft, T. A. “On biases in estimation due to the use of preliminary tests of
significance,” Annals of Math. Stat., Vol. 15 (1944), No. 2.



NEWS AND NOTICES 

Readers are invited to submit to the Secretary of the Institute news items of interest 

Personal Items 

Archie Blake is now employed as a ballistician with the Ballistic Research 
Laboratory at Aberdeen Proving Ground. 

Robert V. Bonnar is now employed as Associate Technologist at the Mare 
Island Navy Yard. 

Professor W. G. Cochran has returned to his regular duties at Iowa State 
College. 

Mrs. Bianca Cody (Bianca Rivoli) is now Statistician for the James O. Peck
Research Company, 12 East 41st Street, New York City.

Associate Professor William Feller of Brown University has been appointed 
Professor of Mathematics at Cornell University. 

Professor John Kenney of the University of Wisconsin is now located at the 
Milwaukee branch of the University. 

Myra Levine is now Assistant Mathematical Statistician with the Statistical 
Research Group at Columbia University. 

Mrs. Harold Michaelis (Ruth E. Jolliffe) is 5th Naval District Statistician 
at the Naval Operating base in Norfolk, Va. 

Emma Spaney is Statistician for the Committee on Measurement of the 
National League of Nursing Education. 

Professor J. A. Shohat of the University of Pennsylvania died October 8, 1944.

Mr. Bedford T. Webster of the Western Electric Company died July 31, 1944.


New Members 

The following persons have been elected to membership in the Institute: 

Boddle, John B., Jr. Chief, Program Section, Budget Division, Washington, D. C. S628
Tunlaw Road, N.W.

Bruner, Nancy M.A. (Iowa) Statistician, Western Auto Supply Co., Kansas City, Mo.
7611 Main St.

Christopher, Edward E. B.S. (Mass. Inst. Tech.) Statistician, Signal Corps. 6704 North
Both St., Arlington, Va.

Cowden, Dudley J. Ph.D. (Columbia) Prof. of Economics, Univ. of North Carolina.
Box 616, Chapel Hill, North Carolina.

Cynamon, Manuel M.S. (City Coll., N. Y.) Personnel Tech., Personnel Res. Sec., Adj.
General's Office, War Dept. 10 Ave. P, Brooklyn 4, N. Y.

Evensen, Edward J. On military leave from Metropolitan Life Ins. Co. (Actuarial Sec.)
Sv. Co., 1st Sp. Sv. Force.

Green, Earl L. Ph.D. (Brown) 1st Lieut., A.C., Chief, Dept. of Statistics. AAF School
of Aviation Medicine, Randolph Field, Texas.

Groves, William Brewster B.S. (Antioch) Economist, Off. of Price Administration.
620 Decatur St., N.W., Washington, D. C.

Homieth, Richard Allen M.A. (Wisconsin) Res. Assistant in Sociology, Univ. of Wisconsin.
B07 N. Randall, Madison 6, Wis.

Kinsler, David M. M.A. (Chicago) Chief, Analytical Section, Arms & Ammunition Division,
Aberdeen Proving Ground, Maryland.

Kopp, Paul J. M.A. (Duke) Major, Chemical Warfare Service, U. S. A. 1906 North
Adams, Arlington, Va.

Matacy, Frank Jones, Jr. M.A. (California) Associate, Dept. of Math., Univ. of California,
Berkeley, Calif. 1964 Union St., San Francisco 9, Calif.

Orcutt, Guy H. Ph.D. (Michigan) Instr. Economics Dept., Mass. Inst. of Tech., Cambridge,
Mass.

Rakeaky, Sophie M.S. (Michigan) Statistician, W. K. Kellogg Foundation, Battle Creek,
Mich.

Roberta, Jean M.S. (Minnesota) Statistician, Child Welfare Res. Analyst, 999 Goodrich
Ave., St. Paul 6, Minn.

Schietroma, William B.S.S. (Coll. of City of N. Y.) Research Assistant. S16 East 119th
St., New York, N. Y.

Schlorek, Mary A. A.B. (Adelphi) Research Statistician, National Broadcasting Co.,
30 Rockefeller Plaza, New York, N. Y.

deSousa, Alvaro Pedro B.E. (Liverpool) Vice-Governor, Banco de Portugal. Monserrate,
Rua Infante de Sagres, Estoril, Portugal.

Steele, Floyd George M.S. (Calif. Inst. of Tech.) Stat. Analyst, Douglas Aircraft. 1B16S
Roosevelt Highway, Pacific Palisades, Calif.

Thom, Herbert C. S. 6130 18th Rd., N., Arlington, Va.

Report of the Fifth Pittsburgh Chapter Meeting 

The fifth meeting of the Pittsburgh Chapter of the Institute of Mathematical 
Statistics was held at Engineering Hall, Carnegie Institute of Technology on 
Saturday, November 25, 1944. The meeting was held as a joint session with the
Pittsburgh Quality Control Society. Thirty-one persons attended the meeting, 
including the following six members of the Institute: 

George Eldredge, H. J. Hand, C. R. Mummery, E. G. Olds, E. M. Schrock, J. V. Sturtevant.

The following papers were presented, with Mr. J. V. Sturtevant, of the
Carnegie-Illinois Steel Corporation, acting as chairman:

1. Modified Application of Control Chart to the Use of Gauges on Machine Tool Work. 

Dr. E. G. Olds, War Production Board, Washington, D. C. 

2. Application of Control Charts to Infrequent Inspection of Machine Operations. 

W. D. Angst, Thompson Aircraft Products Company, Cleveland, Ohio. 

3. Application of Control Chart Techniques to Checking Reproducibility of Chemical 
Analysis. 

H. A. Stobbs, Wheeling Steel Corporation, Steubenville, Ohio. 

4. Statistical Principles of Experimental Design as Applied to Tests Conducted in
Manufacturing Operations.

Dr. B. Epstein, Westinghouse Electric & Manufacturing Co., East Pittsburgh, Pa. 

H. J. Hand,

Secretary-Treasurer, Pittsburgh Chapter

Educational Meetings of the Pittsburgh Chapter

The first of a series of educational meetings on methods of statistical computations
given by the Pittsburgh Chapter was held on Saturday afternoon, January
20, 1945. Thirty-three persons attended the meeting, including the following
three members of the Institute: 

Thomas A. Elkins, H. J. Hand, J. V. Sturtevant. 

The following program was presented: 

1. Potential Field for Industrial Applications of Statistical Method.

H. J. Hand, National Tube Company, Pittsburgh, Pa.

2. Computations for Analysis of Variance and Experimental Design. 

Ben Epstein, Westinghouse Electric & Manufacturing Company, East Pittsburgh, Pa.

It is planned to hold these meetings bi-weekly, on Saturday afternoons for an 
indefinite period in the future. Topics to be considered in the series will include: 

1. Analysis of variance and covariance. 

2. Design of experiments. 

3. Tests of significance. 

4. Probability and probability distributions. 

5. Correlation and regression analysis, including the orthogonal coordinate method. 

6. Tests of increased severity. 

7. Sampling theory, including stratification. 

8. Acceptance-rejection mathematics. Dodge sampling inspection tables. 

9. Shewhart control chart techniques. 

10. Analysis of runs. 

11. Cycle analysis. 

12. Factor analysis. 


H. J. Hand, 

Secretary-Treasurer, Pittsburgh Chapter



ANNUAL REPORT OF THE PRESIDENT OF THE INSTITUTE

Continuing the established tradition, the annual summer meeting was held at 
Wellesley, Massachusetts, August 12-13, 1944 in conjunction with the Summer 
Meetings of the American Mathematical Society and the Mathematical Association
of America. A regional meeting was held in Washington, May 6-7, in
conjunction with the meeting of the Washington Chapter of the American
Statistical Association. The programs were arranged by the Program Committee:
W. Feller, Chairman, W. G. Madow, and A. Wald.

Even though, under present war conditions, research in the field of probability 
and statistics is very much curtailed, enough papers in mathematical statistics 
of satisfactory quality have been proposed for publication in the Annals in 1944 
to keep the total volume of material at approximately five hundred pages or the 
level of the last few years. However, the outlook for a sufficient number of 
satisfactory papers to maintain the usual volume of publication during 1945 does 
not look quite so favorable. 

Looking into the future, the Institute must continue to furnish, through the
Annals, a medium for the publication of all important results of original research
in the field of mathematical statistics as they become available. To do otherwise
would be suicide. At the same time we must take account of the growing need
for comprehensive surveys of statistical theory on the part of other scientists,
including not only social scientists but also physicists, chemists, biologists, and
research engineers, whose interest in the contributions of mathematical statistics
has been greatly stimulated during the war. Only the mathematical statistician
of broad competence can provide adequate critical surveys of this character.
Perhaps some of this need can be met through survey articles published in the
Annals, although it is not an easy matter to get capable men to do such work.
Perhaps the time is not far off when the Institute must stimulate the preparation
of such material by instituting an annual series of Colloquium Lectures patterned
somewhat after those of the Mathematical Society, which could be published
separately.

This is but one of many problems that the Institute faces in its post-war 
development. Not only must it assume the responsibility of stimulating and 
encouraging research and of publishing the results; it must also consider the 
problem of training the research statistician of tomorrow as well as those who 
are to apply mathematical statistics in the many fields of science. It also must 
assume some responsibility for keeping in contact with other scientists in order 
that the mathematical statistician may become acquainted with the unsolved 
statistical problems of the scientist. There are also many problems of a
professional character that face the mathematical statistician in the future if he is
to succeed in developing the profession of mathematical statistics to the level 
attained by some of the older scientific professions. 

With the realization of the need for a concerted attack on some of these
problems, the Board of Directors at its meeting in May set up two committees,
one on Training and Placement of Statisticians under Harold Hotelling and the
other on Post-War Development of the Institute under W. G. Cochran. Interim
reports received by the Board from both committees indicate that considerable
progress has been made to date. They also indicate, however, that much
more work remains to be done.

At the same meeting of the Board, a Budget and Finance Committee was set 
up, consisting of P. S. Dwyer, Chairman, C. H. Fischer, A. C. Olshen, and C. F. 
Roos, to prepare a report on the policy that should be followed by the Institute 
in respect to such items as investment of funds, advertising, preparation of an 
annual statement, and the like. Some of the work of this committee has already 
borne fruit, as, for example, in providing the actuarial basis for life membership 
adopted at the Wellesley meeting and in establishing certain principles to be 
used in conducting the business of the Institute. 

A report of the Committee on Membership, W. G. Cochran, Chairman, P. S.
Dwyer, and T. Koopmans, appears elsewhere in this issue of the Annals. Upon
recommendation of this committee, the Board of Directors elected nine new
fellows: Walter Bartky, C. I. Bliss, Gertrude M. Cox, P. A. Horst, M. G. Kendall,
H. B. Mann, E. S. Pearson, Henry Scheffé, and W. A. Wallis.

The nominating committee for the year consisted of John Curtiss, Chairman, 
E. G. Olds, and F. F. Stephan. G. W. Snedecor served the Institute again as its 
representative on the Council of the A.A.A.S. 

The annual election of the Institute just concluded by mail ballot resulted 
in the election of the following officers for 1945: W. E. Deming, President; W. G. 
Cochran, and J. L. Doob, Vice-Presidents. 

Walter A. Shewhart 
President, 1944 

February 10, 1945 



ANNUAL REPORT OF THE SECRETARY-TREASURER
OF THE INSTITUTE

Accounts of the 1944 meetings of the Institute (the Wellesley meeting, the
Washington regional meeting, and the Pittsburgh chapter meetings) have
appeared in appropriate issues of the Annals.

At the Wellesley meeting a number of amendments to the Constitution and 
By-Laws were passed. These were published in the September, 1944, issue of 
the Annals. (The amended Constitution and By-Laws appear elsewhere in this 
issue.) 

Due to a large extent to the cooperation of the membership in sending in
nominations, the Institute enjoyed a large increase in membership during the year.
There were some resignations and it was necessary to suspend fifteen persons at 
the end of 1944 because of failure to pay dues. It is apparent that, in some of 
these cases at least, our mail is not being received. Undoubtedly some of these 
memberships will be restored when contact is again established. As of January 
1, 1945, there were 606 members, a net gain of approximately one hundred 
members. 

During the year the Institute received gifts from Professor Harry Carver in 
the form of exchanges for early issues of the Annals, reprints of early articles, etc.

The Secretary-Treasurer wishes to acknowledge the continued assistance of 
Professor Lloyd Knowler in looking after the back issues of the Annals which are 
stored at Iowa City. 

The following financial statement covers the period from December 22, 1943 
to December 31, 1944 (the books and records of the Treasurer have been audited 
by Professor Thomas A. Bickerstaff and were found to be in agreement with the 
statement as submitted): 

FINANCIAL STATEMENT
December 22, 1943, to December 31, 1944

Receipts

Balance on Hand, December 22, 1943 .......................... $3,715.05

Dues
    1944 and before ........................... $2,995.31
    1945 and 1946 .............................  1,127.00
    Life ......................................    330.00
                                                              4,452.31

Subscriptions
    1944 and before ........................... $1,301.94
    1945 and 1946 .............................    883.94
                                                              2,185.88

Sale of Back Numbers ........................................  1,385.02

Miscellaneous ...............................................      6.15

Total Receipts ............................................. $11,744.41

Expenditures

Annals: Current
    Office of Editor ..........................   $273.77
    Waverly Press .............................  3,448.51
                                                             $3,722.28

Annals: Back Numbers
    Purchase from H. C. Carver ................   $149.40
    Iowa City Office ..........................     96.26
                                                                245.66

Office of Secretary-Treasurer
    Printing, mimeographing, programs, etc.
      (including stamped envelopes) ...........   $377.00
    Postage and supplies ......................     68.02
    Clerical help .............................    455.94
    Moving office from Pittsburgh .............     55.79
                                                                956.75

Miscellaneous ...............................................     29.07

Balance on Hand, December 31, 1944 ..........................  6,790.65

                                                            $11,744.41

No unpaid bills were in the hands of the Treasurer as of December 31, 1944,
and aside from an additional $100.00 which the Board has designated for Annals
expense for 1944, there were no large bills outstanding.

Accounts receivable as of December 31, 1944, amounted to $303.73. Many
of these accounts are current accounts while some of the older ones are accounts
with firms in India, which probably will be collected eventually.

The American Library Association continued with its purchase of thirty sets
of Volume XV of the Annals (for post war distribution) and the Universal Trading
Corporation (representing the Chinese Government) purchased twenty
sets of Volumes 11-17 inclusive. These orders contributed in no small way to
the total 1944 income of $8,029.36.

The 1944 balance $6,790.65 (consisting of bank balance of $3,790.65 and
$3,000.00 in government bonds) is $3,075.60 higher than it was on December 21,
1943. This increase is due in part to 1944 business and in part to the fact that
unusually large payments toward future business, such as the $330.00 in life
payments and the $1,127.00 in 1945 and 1946 dues, have been made.

To summarize the situation briefly, the Institute's 1944 activity has resulted
in a gain of approximately $1,500.00 and we are about this much in advance
of our usual position with reference to the payments of following years.

Paul S. Dwyer
Secretary-Treasurer.

December 31, 1944











REPORT OF THE MEMBERSHIP COMMITTEE OF THE INSTITUTE

Since the duties of this Committee are not defined in detail in the Constitution, 
the Board of Directors asked the Committee to prepare a statement describing 
the appropriate composition and function of the Committee on Membership. 
This work resulted in the preparation of amendments to the Constitution and 
By-laws. These amendments were passed at the business meeting at Wellesley 
College on August 13, 1944, and are printed in full in the September, 1944, issue 
of the Annals (p. 340).

In brief, the duties of the Committee are specified as follows in these
amendments:

(a) The Committee holds the power of election to the grades of Member and 
Junior Member and makes recommendations to the Board of Directors with 
reference to placing members in the other grades of membership. 

(b) It is the duty of the Committee to prepare and make available through 
the Secretary-Treasurer an announcement of the qualifications necessary for
the different grades of membership and to review these qualifications periodically. 

(c) The Committee considers plans for increasing the number of applicants 
for membership. 

As permitted by the amendments referred to above, the power of election to 
the grades of Member and Junior Member was delegated by the Committee in 
August, 1944, to the Secretary-Treasurer, subject to certain reservations. The 
statement of qualifications for the different grades of membership as mentioned 
in (b) above is published below. At the August 13 meeting of the Board of 
Directors it was decided that no elections should be made at present to the grades 
of Honorary Member and Sustaining Member. 

On the recommendation of the Membership Committee the following members 
were elected as Fellows by the Board of Directors: W. Bartky, C. I. Bliss, G. M. 
Cox, P. A. Horst, M. G. Kendall, H. B. Mann, E. S. Pearson, H. Scheffé, W. A.
Wallis. 


Statement of Qualifications for the Different Grades of 
Membership in the Institute of 
Mathematical Statistics 

Member. The candidate shall either (a) be actively engaged in or show a
serious interest in mathematical statistics, or (b) be interested in some applied
field of statistics, with a desire to keep himself informed regarding recent
developments in mathematical theory and techniques.

Junior Member.

1. Any undergraduate student of a collegiate institution is eligible for election
as a Junior Member of the Institute of Mathematical Statistics provided that he
or she is sponsored by a member of the Institute.

2. The annual dues ($2.50) must be submitted with the application.

3. Annual membership shall coincide with the calendar year and the Junior
Member shall receive a complete volume of the Annals of Mathematical Statistics
for the year in which he or she is elected.

4. Junior Membership shall be limited to a term of two years, but a Junior 
Member may apply for transfer to ordinary membership at the beginning of his 
second year. 

Fellow. 

1. The candidate shall have evidenced continuing activity in research in
mathematical statistics by publication beyond his doctor's dissertation of
independent work of merit. Normally two or three worthwhile papers beyond the
dissertation will be required to establish this fact.

2. The first qualification may be partly or wholly waived in the case of (a)
a candidate of well-established leadership among mathematical statisticians whose
contributions to the development of the field of mathematical statistics other
than sufficient published original research shall be judged of equal value or (b)
a candidate of well-established leadership in the applications of mathematical
statistics, whose work has contributed greatly to the utility of and the appreciation
for mathematical statistics.

Honorary Member. A person of exceptional ability and acknowledged leadership
in the field of mathematical statistics may be elected to the grade of Honorary
Member by the Board of Directors, upon the recommendation of the
Committee on Membership.

Sustaining Member. The Board of Directors shall have the power to elect to 
Sustaining Membership any individual, group or corporation that is interested 
in furthering the purposes for which the Institute was formed. 

W. G. Cochran (Chairman) 
W. E. Deming
P. S. Dwyer
T. Koopmans


February 10, 1945



PROGRESS REPORT OF THE COMMITTEE ON POST-WAR
DEVELOPMENT OF THE INSTITUTE

In considering the post-war development of the Institute of Mathematical 
Statistics, the Committee has recognized two general problems: 

A. The problem of what additional activities the Institute should undertake 
in order to provide further stimulus to the development of the field of 
mathematical statistics. 

B. The problem of determining how the Institute can cooperate more effectively
with the users of statistical techniques.

Because of rapidly increasing interest in the application of statistical methods 
in many different fields, the Committee has directed most of its attention thus 
far to Problem B; the present progress report is concerned with the work of the 
Committee on this problem. The Committee hopes to submit a report on 
Problem A at the end of 1945. 

With respect to Problem B, it is the opinion of the Committee that a central
organization for the statistical societies should be of common interest. Accordingly,
a plan was worked out and submitted to the Board of Directors of the
Institute at the Wellesley meeting of the Institute. This proposal and its
present status are discussed below.

We believe that there is much to be gained from an organization that would 
form a link between the various statistical societies, and would have the following 
principal aims: 

(1) To represent the members of the societies in all matters of common interest. 

(2) To promote cooperation between statisticians working in the different
fields of application, and between mathematical statistics, applied statistics,
scientific research and the industries.

(3) To develop amongst the public an appreciation of the value of the statistical
method in scientific inquiry.

It is our opinion that an organization similar to that of the Institute of Physics
would be suitable. The statistical societies, while retaining their present autonomies,
would become founding members of a corporation whose governing
board would contain representatives from each society. In pursuance of its aims
as outlined above, the new organization might:

(a) Take the lead in formulating policies on questions which concern all 
statisticians. 

(b) Publish a journal of general interest to statisticians and undertake the
routine work in connection with the publication of the journals of the
individual societies, the societies retaining in full their present responsibility
for the contents of their journals.

(c) Arrange joint meetings between different statistical societies and between
statistical and other scientific societies.

(d) Assist new groups in organizing for their benefit, either under the auspices
of one of the present societies or in a new society, which might at first be
given associate membership and later full membership of the central
organization.

(e) Take steps to bring news about the use of statistics in scientific research 
to the attention of the public and more particularly of leaders in industry, 
in federal, state and local agencies and in education. 

(f) Investigate the demands for various types and degrees of statistical 
training, outline courses of training in statistics suitable for meeting these 
demands and make strenuous efforts to have the recommended courses 
of training put into effect, in order that statisticians can be of fullest 
service in the nation’s work. In this connection an information and 
placement bureau may be an appropriate auxiliary. 

(g) Institute an abstracting service in statistical methodology. This might 
take the form of a periodical publication of abstracts of papers with respect 
to their methodological content rather than their subject matter. The 
coverage would include journals of business, marketing, engineering, 
medicine and agriculture as well as purely statistical publications. 

The financial needs of the new organization, which would maintain a paid 
full-time staff, may be met initially by contributions from the present societies. 
In view of the extra services which would be rendered to statisticians, some 
increase in the subscription rates of the present societies appears reasonable. A 
member who belongs to more than one of the present societies would pay the 
extra amount only once. Supplementary income might be derived from advertising
in the journal of the central organization and from the establishment
of sustaining or corporate memberships in the central organization. 

At the time of the Wellesley meeting of the Board, there had been only informal
contacts between members of this Committee and members of other
statistical societies. We considered it our first task to obtain some consensus of
opinion from the standpoint of the Institute of Mathematical Statistics. Following
general approval by the Board of Directors of the Institute, members of
the Committee discussed the proposal for a central organization with representatives
of several other statistical societies. The American Statistical Association
has a Committee to consider the future structure of the Association and this
Committee brought the Institute proposal before the Board of Directors of the
Association for action. As the oldest of the statistical societies, the American
Statistical Association then invited participation in an intersociety committee
by the Institute and nine other societies or sections, directly or indirectly concerned
with statistical method. This committee is to explore the possibilities of
coordinating the activities of the several statistical societies and report its
recommendations back to each organization. The representatives have now
been named and the first meeting was held on February 10, 1945, in New York.
At this meeting the Institute was represented by W. G. Cochran and Lt. John
H. Curtiss.

With regard to the problem of what additional activities the Institute should
undertake in order to furnish additional stimulation to the development of the
field of mathematical statistics, the Committee has discussed several ideas which
appear promising. It is hoped to present a complete report on this phase of the
Committee’s work at the end of this year.

C. I. Bliss

W. G. Cochran (Chairman)
W. E. Deming
P. S. Olmstead
S. S. Wilks 


February 12, 1946 



CONSTITUTION 

OF THE 

INSTITUTE OF MATHEMATICAL STATISTICS 

ARTICLE I 
Name and Purpose

1. This organization shall be known as the Institute of Mathematical Statistics. 

2. Its object shall be to promote the interests of mathematical statistics. 

ARTICLE II 

Membership 

1. The membership of the Institute shall consist of Members, Junior Members, Fellows, 
Honorary Members, and Sustaining Members. 

2. Voting members of the Institute shall be (a) the Fellows, and (b) all others, Junior 
Members excepted, who have been members for twenty-three months prior to the date of 
voting. 

3. No person shall be a Junior Member of the Institute for more than a limited term as
determined by the Committee on Membership and approved by the Board of Directors.

ARTICLE III 

Officers, Board of Directors, and Committee on Membership 

1. The Officers of the Institute shall be a President, two Vice-Presidents, and a
Secretary-Treasurer. The terms of office of the President and Vice-Presidents shall be one year
and that of the Secretary-Treasurer three years. Elections shall be by majority ballots at
Annual Meetings of the Institute. Voting may be in person or by mail.

(a) Exception. The first group of Officers shall be elected by a majority vote of the
individuals present at the organization meeting, and shall serve until December 31, 1936.

2. The Board of Directors of the Institute shall consist of the Officers, the two previous 
Presidents, and the Editor of the Official Journal of the Institute. 

3. The Institute shall have a Committee on Membership composed of a Chairman and 
three Fellows. At their first meeting subsequent to the Adoption of this Constitution, the 
Board of Directors shall elect three members as Fellows to serve as the Committee on 
Membership, one member of the Committee for a term of one year, another for a term of 
two years, and another for a term of three years. Thereafter the Board of Directors shall 
elect from among the Fellows one member annually at their first meeting after their
election for a term of three years. The President shall designate one of the Vice-Presidents as
Chairman of this Committee. 


ARTICLE IV 
Meetings 

1. A meeting for the presentation and discussion of papers, for the election of Officers,
and for the transaction of other business of the Institute shall be held annually at such
time as the Board of Directors may designate. Additional meetings may be called from
time to time by the Board of Directors and shall be called at any time by the President
upon written request from ten Fellows. Notice of the time and place of meeting shall be
given to the membership by the Secretary-Treasurer at least thirty days prior to the date
set for the meeting. All meetings except executive sessions shall be open to the public.
Only papers accepted by a Program Committee appointed by the President may be
presented to the Institute.

2. The Board of Directors shall hold a meeting immediately after their election and 
again immediately before the expiration of their term. Other meetings of the Board may 
be held from time to time at the call of the President or any two members of the Board. 
Notice of each meeting of the Board, other than the two regular meetings, together with a
statement of the business to be brought before the meeting, must be given to the members 
of the Board by the Secretary-Treasurer at least five days prior to the date set therefor. 
Should other business be passed upon, any member of the Board shall have the right to 
reopen the question at the next meeting. 

3. Meetings of the Committee on Membership may be held from time to time at the call 
of the Chairman or any member of the Committee provided notice of such call and the 
purpose of the meeting is given to the members of the Committee by the Secretary- 
Treasurer at least five days before the date set therefor. Should other business be passed 
upon, any member of the Committee shall have the right to reopen the question at the 
next meeting. Committee business may also be transacted by correspondence if that 
seems preferable. 

4. At a regularly convened meeting of the Board of Directors, four members shall
constitute a quorum. At a regularly convened meeting of the Committee on Membership,
two members shall constitute a quorum.

ARTICLE V 

Publications

1. The Annals of Mathematical Statistics shall be the Official Journal for the Institute. 
The Editor of the Annals of Mathematical Statistics shall be a Fellow appointed by the 
Board of Directors of the Institute. The term of office of the Editor may be terminated at 
the discretion of the Board of Directors. 

2. Other publications may be originated by the Board of Directors as occasion arises. 

ARTICLE VI 
Expulsion or Suspension

1. Except for non-payment of dues, no one shall be expelled or suspended except by 
action of the Board of Directors with not more than one negative vote. 

ARTICLE VII 

Amendments 

1. This constitution may be amended by an affirmative two-thirds vote at any regularly 
convened meeting of the Institute provided notice of such proposed amendment shall have 
been sent to each voting member by the Secretary-Treasurer at least thirty days before the 
date of the meeting at which the proposal is to be acted upon. Voting may be in person or
by mail.


BY-LAWS

ARTICLE I 

Duties of the Officers, the Editor, Board of Directors, and Committee on Membership

1. The President, or in his absence, one of the Vice-Presidents, or in the absence of the 
President and both Vice-Presidents, a Fellow selected by vote of the Fellows present, shall 
preside at the meetings of the Institute and of the Board of Directors. At meetings of the 
Institute, the presiding officer shall vote only in the case of a tie, but at meetings of the 
Board of Directors he may vote in all cases. At least three months before the date of the 
annual meeting, the President shall appoint a Nominating Committee of three members. 
It shall be the duty of the Nominating Committee to make nominations for Officers to be 
elected at the annual meeting and the Secretary-Treasurer shall notify all voting members 
at least thirty days before the annual meeting. Additional nominations may be submitted
in writing, if signed by at least ten Fellows of the Institute, up to the time of the
meeting. 

2. The Secretary-Treasurer shall keep a full and accurate record of the proceedings at 
the meetings of the Institute and of the Board of Directors, send out calls for said meetings 
and, with the approval of the President and the Board, carry on the correspondence of the 
Institute. Subject to the direction of the Board, he shall have charge of the archives and 
other tangible and intangible property of the Institute, and once a year he shall publish in 
the Annals of Mathematical Statistics a classified list of all Members and Fellows of the
Institute. He shall send out calls for annual dues and acknowledge receipt of same; pay
all bills approved by the President for expenditures authorized by the Board or the Institute;
keep a detailed account of all receipts and expenditures, prepare a financial statement
at the end of each year and present an abstract of the same at the annual meeting of the 
Institute after it has been audited by a Member or Fellow of the Institute appointed by the 
President as Auditor. The Auditor shall report to the President. 

3. Subject to the direction of the Board, the Editor shall be charged with the responsibility
for all editorial matters concerning the editing of the Annals of Mathematical Statistics.
He shall, with the advice and consent of the Board, appoint an Editorial Committee
of not less than twelve members to co-operate with him; four for a period of five years,
four for a period of three years, and the remaining members for a period of two years,
appointments to be made annually as needed. All appointments to the Editorial Committee
shall terminate with the appointment of a new Editor. The Editor shall serve as
editorial adviser in the publication of all scientific monographs and pamphlets authorized
by the Board.

4. The Board of Directors shall have charge of the funds and of the affairs of the Institute,
with the exception of those affairs specifically assigned to the President or to the
Committee on Membership. The Board shall have authority to fill all vacancies ad interim,
occurring among the Officers, Board of Directors, or in any of the Committees. The
Board may appoint such other committees as may be required from time to time to carry
on the affairs of the Institute. The power of election to the different grades of Membership,
except the grades of Member and Junior Member, shall reside in the Board.

5. The Committee on Membership shall prepare and make available through the
Secretary-Treasurer an announcement indicating the qualifications requisite for the different
grades of membership. The Committee shall review these qualifications periodically
and shall make such changes in these qualifications and make such recommendations with
reference to the number of grades of membership as it deems advisable. The power to
elect worthy applicants to the grades of Member and Junior Member shall reside in the
Committee, which may delegate this power to the Secretary-Treasurer, subject to such
reservations as the Committee considers appropriate. The Committee shall make
recommendations to the Board of Directors with reference to placing members in other grades
of membership. The Committee shall give its attention to the question of increasing the
number of applicants for membership and shall advise the Secretary-Treasurer on plans
for that purpose.


ARTICLE II 
Dues 

1. Members shall pay five dollars at the time of admission to membership and shall receive
the full current volume of the Official Journal. Thereafter, Members shall pay five dollars
annual dues. The annual dues of Junior Members shall be two dollars and fifty cents.
The annual dues of Fellows shall be five dollars. The annual dues of Sustaining Members
shall be fifty dollars. Honorary Members shall be exempt from all dues.

(a) Exception. In the case that two Members of the Institute are husband and wife 
and they elect to receive between them only one copy of the Official Journal, the annual 
dues of each shall be three dollars and seventy-five cents. 

(b) Exception. Any Member or Fellow may make a single payment which will be 
accepted by the Institute in place of all succeeding yearly dues and which will not otherwise 
alter his status as a Member or Fellow. The amount of this payment will depend upon 
the age of this Member or Fellow and will be based upon a suitable table and rate of interest,
to be specified by the Board of Directors.

(c) Exception. Any Member or Junior Member of the Institute serving, except as a 
commissioned officer, in the Armed Forces of the United States or of one of its allies, may 
upon notification to the Secretary-Treasurer be excused from the payment of dues until the 
January first following his discharge from the Service. He shall have all privileges of 
membership except that he shall not receive the Official Journal. However during the first 
year of his resumed regular membership he may have the right to purchase, at $2.60 per 
volume, one copy of each volume of the Official Journal published during the period of his 
service membership. 

2. Annual dues shall be payable on the first day of January of each year. 

3. The annual dues of a Fellow, Member, or Junior Member include a subscription to the 
Official Journal. The annual dues of a Sustaining Member include two subscriptions to 
the Official Journal. 

4. It shall be the duty of the Secretary-Treasurer to notify by mail anyone whose dues
may be six months in arrears, and to accompany such notice by a copy of this Article. If
such person fail to pay such dues within three months from the date of mailing such notice,
the Secretary-Treasurer shall report the delinquent one to the Board of Directors, by whom
the person’s name may be stricken from the rolls and all privileges of membership withdrawn.
Such person may, however, be re-instated by the Board of Directors upon payment
of the arrears of dues.


ARTICLE III
Salaries

1. The Institute shall not pay a salary to any Officer, Director, or member of any
committee.


ARTICLE IV 

Amendments 

1. These By-Laws may be amended in the same manner as the Constitution or by a
majority vote at any regularly convened meeting of the Institute, if the proposed amendment
has been previously approved by the Board of Directors.



A. Introduction

By a sequential test of a statistical hypothesis is meant any statistical test
procedure which gives a specific rule, at any stage of the experiment (at the
n-th trial for each integral value of n), for making one of the following three
decisions: (1) to accept the hypothesis being tested (null hypothesis), (2) to
reject the null hypothesis, (3) to continue the experiment by making an additional
observation. Thus, such a test procedure is carried out sequentially.
On the basis of the first trial, one of the three decisions mentioned above is made. 
If the first or the second decision is made, the process is terminated. If the 
third decision is made, a second trial is performed. Again on the basis of the 
first two trials one of the three decisions is made and if the third decision is 
reached a third trial is performed, etc. This process is continued until either 
the first or the second decision is made. 

An essential feature of the sequential test, as distinguished from the current 
test procedure, is that the number of observations required by the sequential 
test is not predetermined, but is a random variable due to the fact that at any 
stage of the experiment the decision of terminating the process depends on the 
results of the observations previously made. The current test procedure may 
be considered a limiting case of a sequential test in the following sense: For any
positive integer n less than some fixed positive integer N, the third decision is
always taken at the n-th trial irrespective of the results of these first n trials.
At the N-th trial either the first or the second decision is taken. Which decision
is taken will depend, of course, on the results of the N trials.
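
As an illustration only, and not as part of the original exposition, the three-decision procedure can be sketched as a loop that consults a decision rule after each trial. The names `run_sequential_test` and `fixed_sample_rule`, and the string labels for the three decisions, are hypothetical. The second function shows how the current (fixed-sample) procedure appears as the limiting case in which the third decision is always taken before the N-th trial.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical labels for the three decisions described in the text.
ACCEPT, REJECT, CONTINUE = "accept", "reject", "continue"

Rule = Callable[[List[float]], str]


def run_sequential_test(observations: Iterable[float], rule: Rule) -> Tuple[str, int]:
    """Apply `rule` after every trial; stop at the first accept/reject decision.

    Returns the decision and the number of observations actually used.
    If the data run out first, the decision returned is CONTINUE.
    """
    seen: List[float] = []
    for x in observations:
        seen.append(x)
        decision = rule(seen)
        if decision != CONTINUE:
            return decision, len(seen)
    return CONTINUE, len(seen)


def fixed_sample_rule(n_fixed: int, final_rule: Rule) -> Rule:
    """Current test procedure as a limiting case of a sequential test.

    Before the n_fixed-th trial the rule always says CONTINUE, whatever the
    data; at the n_fixed-th trial it defers to `final_rule`.
    """
    def rule(seen: List[float]) -> str:
        if len(seen) < n_fixed:
            return CONTINUE
        return final_rule(seen)
    return rule
```

With a rule built by `fixed_sample_rule(5, ...)`, the number of observations used is always 5 whenever at least 5 are available, whereas a genuinely sequential rule makes that number depend on the data.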

In a sequential test, as well as in the current test procedure, we may commit
two kinds of errors. We may reject the null hypothesis when it is true (error
of the first kind), or we may accept the null hypothesis when some alternative
hypothesis is true (error of the second kind). Suppose that we wish to test the
null hypothesis $H_0$ against a single alternative hypothesis $H_1$, and that we want
the test procedure to be such that the probability of making an error of the
first kind (rejecting $H_0$ when $H_0$ is true) does not exceed a preassigned value $\alpha$,
and the probability of making an error of the second kind (accepting $H_0$ when
$H_1$ is true) does not exceed a preassigned value $\beta$. Using the current test
procedure, i.e., a most powerful test for testing $H_0$ against $H_1$ in the sense of the
Neyman-Pearson theory, the minimum number of observations required by the
test can be determined as follows: For any given number N of observations a
most powerful test is considered for which the probability of an error of the first
kind is equal to $\alpha$. Let $\beta(N)$ denote the probability of an error of the second
kind for this test procedure. Then the minimum number of observations is
equal to the smallest positive integer N for which $\beta(N) \le \beta$.
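
The determination of the minimum N can be made concrete in a special case. The sketch below assumes, purely for illustration, that X is normally distributed with known standard deviation `sigma`, that $H_0$ and $H_1$ specify the means `theta0` < `theta1`, and that the most powerful test rejects $H_0$ for large values of the sample mean; none of these specifics is fixed by the text at this point, and the function names are hypothetical. Under these assumptions $\beta(N) = \Phi\big(z_{1-\alpha} - \sqrt{N}\,(\theta_1 - \theta_0)/\sigma\big)$, where $\Phi$ is the standard normal distribution function and $z_{1-\alpha}$ its $(1-\alpha)$ quantile.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1


def beta_of_n(n: int, alpha: float, theta0: float, theta1: float, sigma: float) -> float:
    """Error probability of the second kind for the size-alpha test on n observations.

    Assumes a normal variable with known sigma and theta1 > theta0, so that
    the most powerful test rejects for large sample means.
    """
    z = _STD_NORMAL.inv_cdf(1.0 - alpha)             # critical value under H_0
    shift = (theta1 - theta0) * n ** 0.5 / sigma     # standardized separation
    return _STD_NORMAL.cdf(z - shift)


def min_sample_size(alpha: float, beta: float, theta0: float, theta1: float,
                    sigma: float, n_max: int = 1_000_000) -> int:
    """Smallest positive integer N with beta(N) <= beta, found by direct search."""
    for n in range(1, n_max + 1):
        if beta_of_n(n, alpha, theta0, theta1, sigma) <= beta:
            return n
    raise ValueError("no N up to n_max satisfies beta(N) <= beta")
```

The linear search mirrors the definition in the text directly; for this particular normal case a closed form for N also exists, but the search makes no use of it.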

In this paper a particular test procedure, called the sequential probability
ratio test, is devised and shown to have certain optimum properties (see section
4.7). The sequential probability ratio test in general requires an expected number
of observations considerably smaller than the fixed number of observations
needed by the current most powerful test which controls the errors of the first
and second kinds to exactly the same extent (has the same $\alpha$ and $\beta$) as the
sequential test. The sequential probability ratio test frequently results in a
saving of about 50% in the number of observations as compared with the current
most powerful test. Another surprising feature of the sequential probability
ratio test is that the test can be carried out without determining any
probability distributions whatsoever. In the current procedure the test can be
carried out only if the probability distribution of the statistic on which the test
is based is known. This is not necessary in the application of the sequential
probability ratio test, and only simple algebraic operations are needed for carrying
it out. Distribution problems arise in connection with the sequential probability
ratio test only if we want to make statements about the probability distribution
of the number of observations required by the test.
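
To indicate what "simple algebraic operations" may look like in practice, the following is a minimal sketch of a probability ratio test for a binomial (success/failure) variable. It is not taken from this Introduction: the function name is hypothetical, and the stopping boundaries $(1-\beta)/\alpha$ and $\beta/(1-\alpha)$ are the customary approximate boundaries, assumed here rather than derived. The sketch accumulates the logarithm of the probability ratio and stops as soon as it leaves the interval between the two boundaries.

```python
import math
from typing import Iterable, Tuple


def sprt_bernoulli(observations: Iterable[int], p0: float, p1: float,
                   alpha: float, beta: float) -> Tuple[str, int]:
    """Sketch of a sequential probability ratio test of p = p0 against p = p1.

    Each observation is 1 (success) or 0 (failure), with 0 < p0, p1 < 1.
    Returns ("reject", n), ("accept", n) or ("continue", n), where n is the
    number of observations used. The boundaries below are the customary
    approximations and are an assumption of this sketch.
    """
    upper = math.log((1.0 - beta) / alpha)    # reject H_0 at or above this
    lower = math.log(beta / (1.0 - alpha))    # accept H_0 at or below this
    step_success = math.log(p1 / p0)
    step_failure = math.log((1.0 - p1) / (1.0 - p0))

    log_ratio = 0.0
    n = 0
    for n, x in enumerate(observations, start=1):
        log_ratio += step_success if x else step_failure
        if log_ratio >= upper:
            return "reject", n
        if log_ratio <= lower:
            return "accept", n
    return "continue", n
```

Only additions and two comparisons are performed at each trial; no sampling distribution of a test statistic is consulted.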

This paper consists of two parts. Part I deals with the theory of sequential 
tests for testing a simple hypothesis against a single alternative. In Part II a 
theory of sequential tests for testing simple or composite hypotheses against 
infinite sets of alternatives is outlined. The extension of the probability ratio
test to the case of testing a simple hypothesis against a set of one-sided alternatives
is straightforward and does not present any difficulty. Applications to
testing the means of binomial and normal distributions, as well as to testing
double dichotomies are given. The theory of sequential tests of hypotheses
with no restrictions on the possible values of the unknown parameters is, however,
not as simple. There are several unsolved problems in this case and it is
hoped that the general ideas outlined in Part II will stimulate further research. 

Sections 5.2, 5.3 and 5.4 in Part II deal with the applications of the sequential 
probability ratio test to binomial distributions, double dichotomies and normal 
distributions. These sections are nearly self-contained and can be understood 
without reading the rest of the paper. Thus, readers who are primarily interested
in these special cases of the sequential probability ratio test rather than
in the general theory, may profitably read only the above mentioned sections.
For the benefit of readers who lack a sufficient background in the mathematical
theory of statistics the exposition in sections 5.2, 5.3 and 5.4 is kept on a fairly
elementary level.

It should be pointed out that whenever the number of observations on which
the test is based is for some reason determined in advance, for instance, if certain
data are available from past history and no additional data can be obtained, then
the current most powerful test procedure is preferable. The superiority of the
sequential probability ratio test is due to the fact that it requires a smaller expected
number of observations than the current most powerful test. This
feature of the sequential probability ratio test is, however, of no value if the number
of observations is for some reason determined in advance.

B. Historical Note 

To the best of the author’s knowledge the first idea of a sequential test, i.e.,
a test where the number of observations is not predetermined but is dependent
on the outcome of the observations, goes back to H. F. Dodge and H. G. Romig
who proposed a double sampling inspection procedure [1]. In this double sampling
scheme the decision whether a second sample should be drawn or not depends
on the outcome of the observations in the first sample. The reason for
introducing a double sampling method was, of course, the recognition of the fact
that double sampling results in a reduction of the amount of inspection as compared
with “single” sampling.

The double sampling method does not fully take advantage of sequential 
analysis, since it does not allow for more than two samples. A multiple sampling 
scheme for the particular case of testing the mean of a binomial distribution was 
proposed and discussed by Walter Bartky [2]. His procedure is closely related 
to the test which results from the application of the sequential probability ratio 
test to testing the mean of a binomial distribution. Bartky clearly recognized 
the fact that multiple sampling results in a considerable reduction of the average 
amount of inspection. 

The idea of chain experiments discussed briefly by Harold Hotelling [3] is also 
somewhat related to our notion of sequential analysis. An interesting example 
of such a chain of experiments is the series of sample censuses of area of jute in 
Bengal carried out under the direction of P. C. Mahalanobis [6]. The successive
preliminary censuses, steadily increasing in size, were primarily designed to
obtain some information as to the parameters to be estimated so that an efficient 
design could be set up for the final sampling of the whole immense jute area in 
the province. 

In March 1943, the problem of sequential analysis arose in the Statistical
Research Group, Columbia University,† in connection with a specific question
posed by Captain G. L. Schuyler of the Bureau of Ordnance, Navy Department.
It was pointed out by Milton Friedman and W. Allen Wallis that the mere notion
of sequential analysis could slightly improve the efficiency of some current most
powerful tests. This can be seen as follows: Suppose that N is the planned
number of trials and $W_N$ is a most powerful critical region based on N observations.
If it happens that on the basis of the first n trials (n < N) it is already
certain that the completed set of N trials must lead to a rejection of the null
hypothesis, we can terminate the experiment at the n-th trial and thus save some
observations. For instance, if $W_N$ is defined by the inequality
$x_1^2 + \cdots + x_N^2 > c$, and if for some n < N we find that
$x_1^2 + \cdots + x_n^2 > c$, we can terminate the
process at this stage. Realization of this naturally led Friedman and Wallis to
the conjecture that modifications of current tests may exist which take advantage
of sequential procedure and effect substantial improvements. More specifically,
Friedman and Wallis conjectured that a sequential test may exist that controls
the errors of the first and second kinds to exactly the same extent as the current
most powerful test, and at the same time requires an expected number of observations
substantially smaller than the number of observations required by the
current most powerful test.*

† The Statistical Research Group operates under a contract with the Office of Scientific
Research and Development and is directed by the Applied Mathematics Panel of the
National Defense Research Committee.
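
The early-termination remark of Friedman and Wallis can be illustrated with a short sketch, assuming a critical region defined by a sum of squares exceeding a constant c. Because every squared term is non-negative, the running sum can only grow, so once it exceeds c the rejection is certain and the remaining trials may be omitted. The function name and the error handling are hypothetical choices for this illustration.

```python
from typing import Sequence, Tuple


def curtailed_sum_of_squares_test(observations: Sequence[float],
                                  n_planned: int, c: float) -> Tuple[bool, int]:
    """Fixed-sample test with early stopping when rejection is already certain.

    The critical region is x_1^2 + ... + x_N^2 > c with N = n_planned.
    Returns (reject, n_used). Only a rejection can be reached early; an
    acceptance always needs all n_planned observations.
    """
    if len(observations) < n_planned:
        raise ValueError("fewer observations supplied than planned")
    total = 0.0
    for n, x in enumerate(observations[:n_planned], start=1):
        total += x * x               # the running sum never decreases
        if total > c:
            return True, n           # rejection certain: stop at trial n
    return False, n_planned
```

The decision reached is always the same as that of the uncurtailed test on all N observations; the saving lies solely in the trials that need not be performed, which is why the improvement from this device alone is slight.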

It was at this stage that the problem was called to the attention of the author 
of the present paper. Since infinitely many sequential test procedures exist, 
the first and basic problem was, of course, to find the particular sequential test 
procedure which is most efficient, i.e., which effects the greatest possible saving 
in the expected number of observations as compared with any other (sequential 
or non-sequential) test. In April, 1943 the author devised such a test, called 
the sequential probability ratio test, which for all practical purposes is most
efficient when used for testing a simple hypothesis $H_0$ against a single alternative
$H_1$.

Because of the substantial savings in the expected number of observations 
effected by the sequential probability ratio test, and because of the simplicity 
of this test procedure in practical applications, the National Defense Research 
Committee considered these developments sufficiently useful for the war effort 
to make it desirable to keep the results out of the reach of the enemy, at least for 
a certain period of time. The author was, therefore, requested to submit his 
findings in a restricted report [7] which was dated September, 1943.* In this 
report the sequential probability ratio test is devised and its mathematical theory 
is developed. In July 1944 a second report [8] was issued by the Statistical 
Research Group which gives an elementary non-mathematical exposition of 
the applications of the sequential probability ratio test, together with charts, 
tables and computational simplifications to facilitate applications. 

Independently of the developments here, G. A. Barnard [9] recognized the
merits of a sequential method of testing, i.e., the possibility of a saving in the 
number of observations as compared with the current most powerful test. He 
also devised an interesting sequential test for testing double dichotomies, which 
differs from the one obtained by applying the sequential probability ratio test. 

Some further developments in the theory of the sequential probability ratio 
test took place in 1944. Extending the methods used in [7], C. M. Stockman 
[10] found the operating characteristic curve of the sequential probability ratio 
test applied to a binomial distribution. Independently of Stockman, Milton 
Friedman and George W. Brown (independently of each other) obtained the 
same result which can be extended to the normal distribution and a few other 
specific distributions, but is not applicable to more general distributions. The 
general operating characteristic curve for any sequential probability ratio test
was derived by the author [11]. A few months later the author developed a
general theory of cumulative sums [4] which gives not only the operating characteristic
curve for any sequential probability ratio test but also the characteristic
function of the number of observations required by the test.

* Bartky’s multiple sampling scheme [2] for testing the mean of a binomial distribution
provides, of course, an example of such a sequential test (see, for example, the remarks on
p. 377 in [2]). Bartky’s results were not known to us at that time, since they were published
nearly a year later.

* The material was recently released making the present publication possible.

The theory of the sequential probability ratio test as given in the present 
paper differs considerably from the exposition given in [7], since the new
developments in [4] have been taken into account. However, some tables and a
few sections of the original report [7] are included in the present paper without 
any substantial changes. 

Part I. Sequential Test of a Simple Hypothesis Against a 
Single Alternative 

1. The Current Test Procedure 

Let $X$ be a random variable. In what follows in this and the subsequent
sections it will be assumed that the random variable $X$ has either a continuous
probability density function or a discrete distribution. Accordingly, by the
probability distribution $f(x)$ of a random variable $X$ we shall mean either the
probability density function of $X$ or the probability that $X = x$, depending upon
whether $X$ is a continuous or a discrete variable. Let the hypothesis $H_0$ to be
tested (null hypothesis) be the statement that the distribution of $X$ is $f_0(x)$.
Suppose that $H_0$ is to be tested against the single alternative hypothesis $H_1$ that
the distribution of $X$ is given by $f_1(x)$.

According to the Neyman-Pearson theory of testing hypotheses a most powerful
critical region for testing $H_0$ against $H_1$ on the basis of $N$ independent
observations $x_1, \ldots, x_N$ on $X$ is given by the set of all sample points
$(x_1, \ldots, x_N)$ for which the inequality

(1.1)    $$\frac{f_1(x_1) f_1(x_2) \cdots f_1(x_N)}{f_0(x_1) f_0(x_2) \cdots f_0(x_N)} \ge k$$

is fulfilled. The quantity $k$ on the right hand side of (1.1) is a constant and is
chosen so that the size of the critical region, i.e., the probability of an error of
the first kind, should have the required value $\alpha$.

For a fixed sample size $N$ the probability $\beta$ of an error of the second kind is a
single valued function of $\alpha$, say $\beta_N(\alpha)$, if a most powerful critical region is used.
Thus, if in addition to fixing the value of $\alpha$ it is required that the probability of
an error of the second kind should have a preassigned value $\beta$, or at least it should
not exceed a preassigned value $\beta$, we are no longer free to choose the sample size
$N$. The minimum number of observations required by the test satisfying these
conditions is equal to the smallest integral value of $N$ for which $\beta_N(\alpha) \le \beta$.
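
As an illustration of the rule just stated, the following sketch searches for the smallest $N$ with $\beta_N(\alpha) \le \beta$ in one concrete case. The case itself is an assumption made only for this example and is not taken from the paper: $X$ is normal with known standard deviation `sigma`, $H_0$ states mean `theta0`, $H_1$ states mean `theta1 > theta0`, so that the most powerful region rejects for large sample means. The function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def beta_n(n, theta0, theta1, sigma, alpha):
    """Second-kind error probability beta_N(alpha) of the most powerful
    one-sided test of size alpha for a normal mean (theta1 > theta0)."""
    shift = math.sqrt(n) * (theta1 - theta0) / sigma
    return STD_NORMAL.cdf(STD_NORMAL.inv_cdf(1 - alpha) - shift)


def fixed_sample_size(theta0, theta1, sigma, alpha, beta):
    """Smallest integer N for which beta_N(alpha) <= beta (direct search)."""
    n = 1
    while beta_n(n, theta0, theta1, sigma, alpha) > beta:
        n += 1
    return n
```

For example, `fixed_sample_size(0.0, 0.5, 1.0, 0.05, 0.05)` gives the fixed sample size needed to separate means 0 and 0.5 with both error probabilities at most .05.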

Thus, the current most powerful test procedure for testing $H_0$ against $H_1$ can
be briefly stated as follows: We choose as critical region the region defined by
(1.1) where the constant $k$ is determined so that the probability of an error of
the first kind should have a preassigned value $\alpha$ and $N$ is equal to the smallest
integer for which the probability of an error of the second kind does not exceed
a preassigned value $\beta$.

2. The Sequential Test Procedure: General Definitions

2.1. Notion of a sequential test. In current tests of hypotheses the number of
observations is treated as a constant for any particular problem. In sequential
tests the number of observations is no longer a constant, but a random variable.
In what follows the symbol $n$ is used for the number of observations required by
a sequential test and the symbol $N$ is used when the number of observations is
treated as a constant.

Sequential tests can be described as follows: For each positive integer $m$ the
$m$-dimensional sample space $M_m$ is subdivided into three mutually exclusive
parts $R_m^0$, $R_m^1$ and $R_m$. After the first observation $x_1$ has been drawn $H_0$ is
accepted if $x_1$ lies in $R_1^0$, $H_0$ is rejected (i.e., $H_1$ is accepted) if $x_1$ lies in $R_1^1$, or a
second observation is drawn if $x_1$ lies in $R_1$. If the third decision is reached and
a second observation $x_2$ drawn, $H_0$ is accepted, $H_1$ is accepted, or a third observation
is drawn according as the point $(x_1, x_2)$ lies in $R_2^0$, $R_2^1$ or in $R_2$. If $(x_1, x_2)$
lies in $R_2$, a third observation $x_3$ is drawn and one of the three decisions is made
according as $(x_1, x_2, x_3)$ lies in $R_3^0$, $R_3^1$ or in $R_3$, etc. This process is stopped
when, and only when, either the first decision or the second decision is reached.
Let $n$ be the number of observations at which the process is terminated. Then
$n$ is a random variable, since the value of $n$ depends on the outcome of the
observations. (It will be seen later that the probability is one that the sequential
process will be terminated at some finite stage.)
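
The three-way subdivision can be sketched as a loop that asks, after each observation, which of the three regions the current sample point lies in. This is only a schematic rendering: the function `classify` stands in for the regions $R_m^0$, $R_m^1$, $R_m$, and the rule `running_sum_rule` is a made-up example, not a test proposed in the paper.

```python
from typing import Callable, Iterable, List, Tuple

ACCEPT_H0 = "accept H0"   # sample point lies in R_m^0
ACCEPT_H1 = "accept H1"   # sample point lies in R_m^1
CONTINUE = "continue"     # sample point lies in R_m


def run_sequential_test(observations: Iterable[float],
                        classify: Callable[[List[float]], str]) -> Tuple[str, int]:
    """Draw observations one at a time until classify() reaches a decision.

    Returns the decision and n, the number of observations used. If the data
    run out first, the decision returned is CONTINUE.
    """
    sample: List[float] = []
    for x in observations:
        sample.append(x)
        decision = classify(sample)
        if decision != CONTINUE:
            return decision, len(sample)
    return CONTINUE, len(sample)


def running_sum_rule(sample: List[float]) -> str:
    """Hypothetical regions: decide once the running sum leaves (-3, 3)."""
    total = sum(sample)
    if total >= 3:
        return ACCEPT_H1
    if total <= -3:
        return ACCEPT_H0
    return CONTINUE
```

Because the stopping stage depends on the data, the second component of the returned pair is the random variable $n$ of the text.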

We shall denote by $E_0(n)$ the expected value of $n$ if $H_0$ is true and by $E_1(n)$
the expected value of $n$ if $H_1$ is true. These expected values, of course, depend
on the sequential test used. In order to put this dependence in evidence, we
shall occasionally use the symbols $E_0(n \mid S)$ and $E_1(n \mid S)$ to denote the values
$E_0(n)$ and $E_1(n)$, respectively, when the sequential test $S$ is applied.

2.2. Efficiency of a sequential test. As in the current test procedure, errors of
two kinds may be committed in sequential analysis. We may reject $H_0$ when
it is true (error of the first kind), or we may accept $H_0$ when $H_1$ is true (error of
the second kind). With any sequential test there will be associated two numbers
$\alpha$ and $\beta$ between 0 and 1 such that if $H_0$ is true the probability is $\alpha$ that we
shall commit an error of the first kind and if $H_1$ is true, the probability is $\beta$ that
we shall commit an error of the second kind. We shall say that two sequential
tests $S$ and $S'$ are of equal strength if the values $\alpha$ and $\beta$ associated with $S$ are
equal to the corresponding values $\alpha'$ and $\beta'$ associated with $S'$. If $\alpha < \alpha'$ and
$\beta \le \beta'$, or if $\alpha \le \alpha'$ and $\beta < \beta'$, we shall say that $S$ is stronger than $S'$ ($S'$ is
weaker than $S$). If $\alpha > \alpha'$ and $\beta < \beta'$, or if $\alpha < \alpha'$ and $\beta > \beta'$, we shall say
that the strength of $S$ is not comparable with that of $S'$.
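
The ordering by strength is a partial order on the pairs $(\alpha, \beta)$. A minimal sketch of the comparison, with a hypothetical function name and with each test represented only by its pair of error probabilities:

```python
def compare_strength(alpha, beta, alpha_prime, beta_prime):
    """Compare test S with errors (alpha, beta) to S' with (alpha', beta')."""
    if alpha == alpha_prime and beta == beta_prime:
        return "equal strength"
    if alpha <= alpha_prime and beta <= beta_prime:
        return "S is stronger"      # at least one inequality is strict here
    if alpha >= alpha_prime and beta >= beta_prime:
        return "S is weaker"
    return "not comparable"         # one error is smaller, the other larger
```

For instance, a test with $(\alpha, \beta) = (.01, .05)$ is stronger than one with $(.05, .05)$, while $(.05, .01)$ and $(.01, .05)$ are not comparable.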

Restricting ourselves to sequential tests of a given strength, we want to make
the number of observations necessary for reaching a final decision as small as
possible. If $S$ and $S'$ are two sequential tests of equal strength we shall say
that $S'$ is better than $S$ if either $E_0(n \mid S') < E_0(n \mid S)$ and $E_1(n \mid S') \le E_1(n \mid S)$,
or $E_0(n \mid S') \le E_0(n \mid S)$ and $E_1(n \mid S') < E_1(n \mid S)$. A sequential test
will be said to be an admissible test if no better test of equal strength exists.
If a sequential test $S$ satisfies both inequalities $E_0(n \mid S) \le E_0(n \mid S')$ and
$E_1(n \mid S) \le E_1(n \mid S')$ for any sequential test $S'$ of strength equal to that of $S$, then
the test $S$ can be considered to be a best sequential test. That such tests exist,
i.e., that it is possible to minimize $E_0(n)$ and $E_1(n)$ simultaneously, is not proved
here; but it is shown later (section 4.7) that for the so called sequential probability
ratio test defined in section 3.1 both $E_0(n)$ and $E_1(n)$ are very nearly
minimized.* Thus, for all practical purposes the sequential probability ratio
test can be considered best.

Since it is unknown whether a sequential test always exists for which both $E_0(n)$
and $E_1(n)$ are exactly minimized, we need a substitute definition of an optimum
test. Several substitute definitions are possible. We could, for example, require
that the test be admissible and the maximum of the two values $E_0(n)$ and
$E_1(n)$ be minimized, or that the mean $\frac{E_0(n) + E_1(n)}{2}$ or some other weighted
average be minimized. All these definitions are equivalent if a sequential test
exists for which both $E_0(n)$ and $E_1(n)$ are minimized; but if they cannot be
minimized simultaneously the definitions differ. Which of them is chosen is of no
significance for the purpose of this paper, since for the sequential probability
ratio test proposed later both expected values $E_0(n)$ and $E_1(n)$ are, if not exactly,
very nearly minimized. If we had a priori knowledge as to how frequently $H_0$
and how frequently $H_1$ will be true in the long run, it would be most reasonable
to minimize a weighted average (weighted by the frequencies of $H_0$ and $H_1$,
respectively) of $E_0(n)$ and $E_1(n)$. However, when such knowledge is absent,
as is usually the case in practical applications, it is perhaps more reasonable to
minimize the maximum of $E_0(n)$ and $E_1(n)$ than to minimize some weighted
average of $E_0(n)$ and $E_1(n)$. Hence the following definition is introduced.

A sequential test $S$ is said to be an optimum test if $S$ is admissible and
$\max\,[E_0(n \mid S),\ E_1(n \mid S)] \le \max\,[E_0(n \mid S'),\ E_1(n \mid S')]$ for all sequential tests $S'$ of
strength equal to that of $S$.

By the efficiency of a sequential test $S$ is meant the value of the ratio*

$$\frac{\max\,[E_0(n \mid S^*),\ E_1(n \mid S^*)]}{\max\,[E_0(n \mid S),\ E_1(n \mid S)]}$$

where $S^*$ is an optimum sequential test of strength equal to that of $S$.

* The author conjectures that $E_0(n)$ and $E_1(n)$ are exactly minimized for the sequential
probability ratio test, but he did not succeed in proving this, except for a special class of
problems (see section 4.7).

* The existence of an optimum sequential test is not essential for the definition of
efficiency, since $\max\,[E_0(n \mid S^*),\ E_1(n \mid S^*)]$ could be replaced by the greatest lower bound of
$\max\,[E_0(n \mid S'),\ E_1(n \mid S')]$ with respect to all sequential tests $S'$ of strength equal to that
of $S$.

2.3. Efficiency of the current procedure, viewed as a particular case of a sequential
test. The current test procedure can be considered as a particular case of a
sequential test. In fact, let $N$ be the size of the sample used in the current procedure
and let $W_N$ be the critical region on which the test is based. Then the
current procedure can be considered as a sequential test defined as follows: For
all $m < N$, the regions $R_m^0$ and $R_m^1$ are the empty subsets of the $m$-dimensional sample
space $M_m$, and $R_m = M_m$. For $m = N$, $R_N^1$ is equal to $W_N$, $R_N^0$ is equal to the
complement of $W_N$ and $R_N$ is the empty set. Thus, for the current procedure
we have $E_0(n) = E_1(n) = N$.
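
The embedding of the fixed-sample procedure can be written out as a sketch. Here `in_critical_region` is a hypothetical stand-in for membership in $W_N$, and the small driver `run` is included only to make the example self-contained.

```python
def fixed_sample_rule(N, in_critical_region):
    """Regions of the current procedure viewed as a sequential test."""
    def classify(sample):
        if len(sample) < N:
            return "continue"            # R_m = M_m for all m < N
        if in_critical_region(sample):
            return "accept H1"           # R_N^1 = W_N
        return "accept H0"               # R_N^0 = complement of W_N
    return classify


def run(observations, classify):
    """Apply a three-way rule observation by observation."""
    sample = []
    for x in observations:
        sample.append(x)
        decision = classify(sample)
        if decision != "continue":
            return decision, len(sample)
    return "continue", len(sample)
```

Whatever the data, such a rule stops at exactly $N$ observations, which is why $E_0(n) = E_1(n) = N$.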

It will be seen later that the efficiency of the current test based on the most
powerful critical region is rather low. Frequently it is below 1/2. In other words,
an optimum sequential test can attain the same $\alpha$ and $\beta$ as the current most
powerful test on the basis of an expected number of observations much smaller
than the fixed number of observations needed for the current most powerful test.

In the next section we shall propose a simple sequential test procedure, called
the sequential probability ratio test, which for all practical purposes can be
considered an optimum sequential test. It will be seen that these sequential tests
usually lead to average savings of about 50% in the number of trials as compared
with the current most powerful test.


3. Sequential Probability Ratio Test 

3.1. Definition of the sequential probability ratio test. We have seen in section
2.1 that the sequential test procedure is defined by subdividing the $m$-dimensional
sample space $M_m$ ($m = 1, 2, \ldots$, ad inf.) into three mutually exclusive parts
$R_m^0$, $R_m^1$ and $R_m$. The sequential process is terminated at the smallest value $n$
of $m$ for which the sample point lies either in $R_m^0$ or in $R_m^1$. If the sample point
lies in $R_n^0$ we accept $H_0$ and if it lies in $R_n^1$ we accept $H_1$.

An indication as to the proper choice of the regions $R_m^0$, $R_m^1$ and $R_m$ can be
obtained from the following considerations: Suppose that before the sample is
drawn there exists an a priori probability that $H_0$ is true and the value of this
probability is known. Denote this a priori probability by $g_0$. Then the a priori
probability that $H_1$ is true is given by $g_1 = 1 - g_0$, since it is assumed that the
hypotheses $H_0$ and $H_1$ exhaust all possibilities. After a number of observations
have been made we gain additional information which will affect the probability
that $H_i$ ($i = 0, 1$) is true. Let $g_{0m}$ be the a posteriori probability that $H_0$ is true
and $g_{1m}$ the a posteriori probability that $H_1$ is true after $m$ observations have been
made. Then according to the well known formula of Bayes we have


(3.1)    $$g_{0m} = \frac{g_0\, p_{0m}(x_1, \ldots, x_m)}{g_0\, p_{0m}(x_1, \ldots, x_m) + g_1\, p_{1m}(x_1, \ldots, x_m)}$$

and

(3.2)    $$g_{1m} = \frac{g_1\, p_{1m}(x_1, \ldots, x_m)}{g_0\, p_{0m}(x_1, \ldots, x_m) + g_1\, p_{1m}(x_1, \ldots, x_m)}$$

where $p_{im}(x_1, \ldots, x_m)$ denotes the probability density in the $m$-dimensional
sample space calculated under the hypothesis $H_i$ ($i = 0, 1$).* As an abbreviation
for $p_{im}(x_1, \ldots, x_m)$ we shall use simply $p_{im}$.
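
A small numerical sketch of Bayes' formula as used here. The helper names are hypothetical, and `joint_density` assumes independent observations so that $p_{im}$ is the product of the single-observation values $f_i(x_j)$.

```python
import math


def joint_density(xs, f):
    """p_im(x_1, ..., x_m) for independent observations with distribution f."""
    return math.prod(f(x) for x in xs)


def posterior_probabilities(g0, p0m, p1m):
    """Return (g_0m, g_1m) from the prior g0 and the two joint densities."""
    g1 = 1.0 - g0
    total = g0 * p0m + g1 * p1m
    return g0 * p0m / total, g1 * p1m / total
```

With $g_0 = 1/2$, $p_{0m} = 0.2$ and $p_{1m} = 0.6$ the posterior probabilities are $0.25$ and $0.75$; the two values always sum to one.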


* If the probability distribution is discrete $p_{im}(x_1, \ldots, x_m)$ denotes the probability that
the sample point $(x_1, \ldots, x_m)$ will be obtained.

Let $d_0$ and $d_1$ be two positive numbers less than 1 and greater than 1/2. Suppose
that we want to construct a sequential test such that the conditional probability
of a correct decision under the condition that $H_0$ is accepted is greater than or
equal to $d_0$, and the conditional probability of a correct decision under the
condition that $H_1$ is accepted is greater than or equal to $d_1$.* Then the following
sequential process seems reasonable: At each stage calculate $g_{0m}$ and $g_{1m}$. If
$g_{1m} \ge d_1$, accept $H_1$. If $g_{0m} \ge d_0$, accept $H_0$. If $g_{1m} < d_1$ and $g_{0m} < d_0$, draw
an additional observation. $R_m^0$ in this sequential process is thus defined by the
inequality $g_{0m} \ge d_0$, $R_m^1$ by the inequality $g_{1m} \ge d_1$, and $R_m$ by the simultaneous
inequalities $g_{1m} < d_1$ and $g_{0m} < d_0$. It is necessary that the sets $R_m^0$, $R_m^1$ and
$R_m$ be mutually exclusive and exhaustive. For this it suffices that the inequalities

(3.3)    $$g_{1m} = \frac{g_1 p_{1m}}{g_0 p_{0m} + g_1 p_{1m}} \ge d_1$$

and

(3.4)    $$g_{0m} = \frac{g_0 p_{0m}}{g_0 p_{0m} + g_1 p_{1m}} \ge d_0$$

be not fulfilled simultaneously. To show that (3.3) and (3.4) are incompatible,
we shall assume that they are simultaneously fulfilled and derive a contradiction
from this assumption. The two inequalities sum to

(3.5)    $$g_{1m} + g_{0m} \ge d_1 + d_0.$$

Since $g_{0m} + g_{1m} = 1$, we have

$$1 \ge d_1 + d_0$$

which is impossible, since by assumption $d_i > \tfrac{1}{2}$ ($i = 0, 1$). Hence it is proved
that the sets $R_m^0$, $R_m^1$ and $R_m$ are mutually exclusive and exhaustive.

The inequalities (3.3) and (3.4) are equivalent to the following inequalities,
respectively:

(3.6)    $$\frac{p_{1m}}{p_{0m}} \ge \frac{g_0}{g_1} \cdot \frac{d_1}{1 - d_1}$$

and

(3.7)    $$\frac{p_{1m}}{p_{0m}} \le \frac{g_0}{g_1} \cdot \frac{1 - d_0}{d_0}.$$

The constants on the right hand sides of (3.6) and (3.7) do not depend on $m$.
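
The passage from posterior thresholds to thresholds on the ratio $p_{1m}/p_{0m}$ can be checked numerically. In this sketch (hypothetical function names) `posterior_h1` is formula (3.3) rewritten in terms of the ratio, obtained by dividing numerator and denominator by $p_{0m}$.

```python
def ratio_thresholds(g0, d0, d1):
    """Right hand sides of (3.7) and (3.6): (lower, upper) limits for p_1m/p_0m."""
    g1 = 1.0 - g0
    upper = (g0 / g1) * d1 / (1.0 - d1)
    lower = (g0 / g1) * (1.0 - d0) / d0
    return lower, upper


def posterior_h1(g0, ratio):
    """g_1m as a function of ratio = p_1m / p_0m."""
    g1 = 1.0 - g0
    return g1 * ratio / (g0 + g1 * ratio)
```

At the upper threshold the posterior probability of $H_1$ equals $d_1$, and at the lower threshold the posterior probability of $H_0$ equals $d_0$. Since $d_0, d_1 > 1/2$, the lower threshold lies below the upper one.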

* The restrictions $d_0 > 1/2$ and $d_1 > 1/2$ are imposed because otherwise it might happen
that the hypothesis with the smaller a posteriori probability will be accepted.

If an a priori probability of $H_0$ does not exist, or if it is unknown, the inequalities
(3.6) and (3.7) suggest the use of the following sequential test: At each stage
calculate $p_{1m}/p_{0m}$. If $p_{1m} = p_{0m} = 0$, the value of the ratio $p_{1m}/p_{0m}$ is defined
to be equal to 1. Accept $H_1$ if

(3.8)    $$\frac{p_{1m}}{p_{0m}} \ge A.$$

Accept $H_0$ if

(3.9)    $$\frac{p_{1m}}{p_{0m}} \le B.$$

Take an additional observation if

(3.10)    $$B < \frac{p_{1m}}{p_{0m}} < A.$$

Thus, the number $n$ of observations required by the test is the smallest integral
value of $m$ for which either (3.8) or (3.9) holds. The constants $A$ and $B$ are
chosen so that $0 < B < A$ and the sequential test has the desired value $\alpha$ of the
probability of an error of the first kind and the desired value $\beta$ of the probability
of an error of the second kind. We shall call the test procedure defined by (3.8),
(3.9) and (3.10), a sequential probability ratio test.

The sequential test procedure given by (3.8), (3.9) and (3.10) has been justified
here merely on an intuitive basis. Section 4.7, however, shows that for this
sequential test the expected values $E_0(n)$ and $E_1(n)$ are very nearly minimized.*
Thus, for practical purposes this test can be considered an optimum test.
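
The rule (3.8), (3.9), (3.10) can be sketched in a few lines for independent observations. The function name and the string return values are choices made for this illustration only. The products are accumulated directly, which keeps the convention for $p_{1m} = p_{0m} = 0$ visible but can underflow on very long samples; a practical implementation would more likely accumulate logarithms.

```python
import math


def sprt(observations, f0, f1, A, B):
    """Sequential probability ratio test with constants 0 < B < A.

    f0 and f1 give the distribution of one observation under H0 and H1.
    Returns (decision, n); the decision is "continue" if the data run out.
    """
    p0m = p1m = 1.0
    m = 0
    for x in observations:
        m += 1
        p0m *= f0(x)
        p1m *= f1(x)
        if p0m == 0.0 and p1m == 0.0:
            ratio = 1.0                  # convention stated in the text
        elif p0m == 0.0:
            ratio = math.inf             # data impossible under H0 only
        else:
            ratio = p1m / p0m
        if ratio >= A:                   # (3.8)
            return "accept H1", m
        if ratio <= B:                   # (3.9)
            return "accept H0", m
    return "continue", m                 # (3.10) held at every stage
```

As a usage sketch, take a binomial problem (an assumed example) with success probability $1/2$ under $H_0$ and $3/4$ under $H_1$, and $A = 19$, $B = 1/19$: each success multiplies the ratio by $1.5$ and each failure by $0.5$, so an unbroken run of successes leads to acceptance of $H_1$ and an unbroken run of failures to acceptance of $H_0$.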

3.2. Fundamental relations among the quantities $\alpha$, $\beta$, $A$ and $B$. In this section
the quantities $\alpha$, $\beta$, $A$ and $B$ will be related by certain inequalities which are of
basic importance for the sequential analysis.

Let $\{x_m\}$ ($m = 1, 2, \ldots$, ad inf.) be an infinite sequence of observations. The
set of all possible infinite sequences $\{x_m\}$ is called the infinite dimensional sample
space. It will be denoted by $M_\infty$. Any particular infinite sequence $\{x_m\}$ is
called a point of $M_\infty$. For any set of $n$ given real numbers $a_1, \ldots, a_n$ we shall
denote by $C(a_1, \ldots, a_n)$ the subset of $M_\infty$ which consists of all points (infinite
sequences) $\{x_m\}$ ($m = 1, 2, \ldots$, ad inf.) for which $x_1 = a_1, \ldots, x_n = a_n$. For
any values $a_1, \ldots, a_n$ the set $C(a_1, \ldots, a_n)$ will be called a cylindric point of
order $n$. A subset $S$ of $M_\infty$ will be called a cylindric point, if there exists a positive
integer $n$ for which $S$ is a cylindric point of order $n$. Thus, a cylindric point
may be a cylindric point of order 1, or of order 2, etc. A cylindric point
$C(a_1, \ldots, a_n)$ will be said to be of type 1 if

$$\frac{p_{1n}}{p_{0n}} = \frac{f_1(a_1) f_1(a_2) \cdots f_1(a_n)}{f_0(a_1) f_0(a_2) \cdots f_0(a_n)} \ge A$$

and

$$B < \frac{p_{1m}}{p_{0m}} = \frac{f_1(a_1) \cdots f_1(a_m)}{f_0(a_1) \cdots f_0(a_m)} < A \qquad (m = 1, \ldots, n - 1).$$

A cylindric point $C(a_1, \ldots, a_n)$ will be said to be of type 0 if

$$\frac{p_{1n}}{p_{0n}} = \frac{f_1(a_1) \cdots f_1(a_n)}{f_0(a_1) \cdots f_0(a_n)} \le B$$

and

$$B < \frac{p_{1m}}{p_{0m}} = \frac{f_1(a_1) \cdots f_1(a_m)}{f_0(a_1) \cdots f_0(a_m)} < A \qquad (m = 1, \ldots, n - 1).$$

Thus, if a sample $(x_1, \ldots, x_n)$ is observed for which $C(x_1, \ldots, x_n)$ is a cylindric
point of type $i$, the sequential test defined by (3.8), (3.9) and (3.10) leads to the
acceptance of $H_i$ ($i = 0, 1$).

* It seems likely to the author that $E_0(n)$ and $E_1(n)$ are exactly minimized for the
sequential probability ratio test. However, he did not succeed in proving it, except for a
special class of problems (see section 4.7).

Let $Q_i$ be the sum of all cylindric points of type $i$ ($i = 0, 1$). For any subset
$M$ of $M_\infty$ we shall denote by $P_i(M)$ the probability of $M$ calculated under the
assumption that $H_i$ is true ($i = 0, 1$). Now we shall prove that

(3.11)    $$P_i(Q_0 + Q_1) = 1 \qquad (i = 0, 1).$$

This equation means that the probability is equal to one that the sequential
process will eventually terminate. To prove (3.11) we shall denote the variate
$\log \frac{f_1(x_i)}{f_0(x_i)}$ by $z_i$ and $z_1 + \cdots + z_m$ by $Z_m$ ($m = 1, 2, \ldots$, ad inf.). Furthermore,
denote by $n$ the smallest integer for which either $Z_n \ge \log A$ or $Z_n \le \log B$.
If no such finite integer $n$ exists we shall say that $n = \infty$. Clearly, $n$ is
the number of observations required by the sequential test and (3.11) is proved
if we show that the probability that $n = \infty$ is zero. But the latter statement
was proved by the author elsewhere (see Lemma 1 in [4]). Hence equation
(3.11) is proved.

With the help of (3.11) we shall be able to derive some important inequalities
satisfied by the quantities $\alpha$, $\beta$, $A$ and $B$. Since for each sample $(x_1, \ldots, x_n)$
for which $C(x_1, \ldots, x_n)$ is an element of $Q_1$ the inequality $p_{1n}/p_{0n} \ge A$ holds,
we see that

(3.12)    $$P_1(Q_1) \ge A\, P_0(Q_1).$$

Similarly, for each sample $(x_1, \ldots, x_n)$ for which $C(x_1, \ldots, x_n)$ is a point of
$Q_0$ the inequality $p_{1n}/p_{0n} \le B$ holds. Hence

(3.13)    $$P_1(Q_0) \le B\, P_0(Q_0).$$

But $P_0(Q_1)$ is the probability of committing an error of the first kind and $P_1(Q_0)$
is the probability of making an error of the second kind. Thus, we have

(3.14)    $$P_0(Q_1) = \alpha; \qquad P_1(Q_0) = \beta.$$

Since $Q_0$ and $Q_1$ are disjoint, it follows from (3.11) that

(3.15)    $$P_0(Q_0) = 1 - \alpha; \qquad P_1(Q_1) = 1 - \beta.$$

From the relations (3.12)-(3.15) we obtain the important inequalities

(3.16)    $$1 - \beta \ge A\alpha$$

and

(3.17)    $$\beta \le B(1 - \alpha).$$

These inequalities can be written as

(3.18)    $$\frac{\alpha}{1 - \beta} \le \frac{1}{A}$$

and

(3.19)    $$\frac{\beta}{1 - \alpha} \le B.$$


The above inequalities are of great value in practical applications, since they
supply upper limits for $\alpha$ and $\beta$ when $A$ and $B$ are given. For instance, it follows
immediately from (3.18) and (3.19), and the fact that $0 < \alpha < 1$, $0 < \beta < 1$ that

(3.20)    $$\alpha \le \frac{1}{A}$$

and

(3.21)    $$\beta \le B.$$

A pair of values $\alpha$ and $\beta$ can be represented by a point in the plane with the
coordinates $\alpha$ and $\beta$. It is of interest to determine the set of all points $(\alpha, \beta)$
which satisfy the inequalities (3.18) and (3.19) for given values of $A$ and $B$.
Consider the straight lines $L_1$ and $L_2$ in the plane given by the equations

(3.22)    $$A\alpha = 1 - \beta$$

and

(3.23)    $$\beta = B(1 - \alpha),$$

respectively. The line $L_1$ intersects the abscissa axis at $\alpha = \frac{1}{A}$ and the ordinate
axis at $\beta = 1$. The line $L_2$ intersects the abscissa axis at $\alpha = 1$ and the ordinate
axis at $\beta = B$. The set of all points $(\alpha, \beta)$ which satisfy the inequalities (3.18)
and (3.19) is the interior and the boundary of the quadrilateral determined by
the lines $L_1$, $L_2$ and the coordinate axes. This set is represented by the shaded
area in figure 1.
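
The quadrilateral can be described explicitly. The sketch below (hypothetical function names) tests membership through the equivalent forms (3.16) and (3.17), and lists the four vertices; the vertex where $L_1$ and $L_2$ meet is obtained by solving (3.22) and (3.23) simultaneously, which is a short calculation not carried out in the text. It assumes $A > 1 > B > 0$.

```python
def satisfies_fundamental_inequalities(alpha, beta, A, B):
    """True if (alpha, beta) satisfies (3.16) and (3.17) for the given A, B."""
    return A * alpha <= 1.0 - beta and beta <= B * (1.0 - alpha)


def quadrilateral_vertices(A, B):
    """Vertices of the admissible region in the (alpha, beta) plane."""
    corner = ((1.0 - B) / (A - B), B * (A - 1.0) / (A - B))  # L1 meets L2
    return [(0.0, 0.0), (1.0 / A, 0.0), corner, (0.0, B)]
```

The vertices on the axes reproduce the limits (3.20) and (3.21). When $A = (1-\beta)/\alpha$ and $B = \beta/(1-\alpha)$ for some target pair, the corner where the two lines meet is that pair itself.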

The fundamental inequalities (3.18) and (3.19) were derived under the assumption
that $x_1, x_2, \ldots$, ad inf. are independent observations on the same random
variable $X$. The independence of the observations is, however, not necessary
for the validity of (3.18) and (3.19). In fact, the independence of the observations
was used merely to show the validity of (3.11). But (3.11) can be shown
to hold also for dependent observations under very general conditions. Hence,
if $H_i$ states that the joint distribution of $x_1, x_2, \ldots, x_m$ is given by the joint
probability density function $p_{im}(x_1, \ldots, x_m)$* ($i = 0, 1$; $m = 1, 2, \ldots$, ad inf.)
and if (3.11) holds, then for the sequential test of $H_0$ against $H_1$, as defined by
(3.8), (3.9) and (3.10), the inequalities (3.18) and (3.19) remain valid. For
instance, let $\lambda_0$ and $\lambda_1$ be two different positive values $< 1$ and let $H_i$ ($i = 0, 1$)
be the hypothesis that the joint probability density function of $x_1, \ldots, x_m$ is
given by

$$p_{im}(x_1, \ldots, x_m) = \frac{1}{(2\pi)^{m/2}} \exp\Big\{-\tfrac{1}{2} x_1^2 - \tfrac{1}{2} \sum_{j=2}^{m} (x_j - \lambda_i x_{j-1})^2\Big\} \qquad (i = 0, 1),$$

i.e., that $x_1$ and $(x_j - \lambda_i x_{j-1})$ ($j = 2, 3, \ldots$, ad inf.) are normally and independently
distributed with zero means and unit variances; then the inequalities
(3.18) and (3.19) will hold for the sequential test defined by (3.8), (3.9) and
(3.10).
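
For the dependent example, the logarithm of the ratio $p_{1m}/p_{0m}$ can be computed directly from the description that $x_1$ and the differences $x_j - \lambda_i x_{j-1}$ are independent standard normal variates. The sketch below is written under that reading; the function names are hypothetical.

```python
import math


def log_joint_density(xs, lam):
    """log p_im(x_1, ..., x_m) when x_1 and x_j - lam * x_{j-1} (j >= 2)
    are independent normal variates with zero mean and unit variance."""
    m = len(xs)
    squares = xs[0] ** 2 + sum((xs[j] - lam * xs[j - 1]) ** 2 for j in range(1, m))
    return -0.5 * m * math.log(2.0 * math.pi) - 0.5 * squares


def log_ratio(xs, lam0, lam1):
    """log(p_1m / p_0m), to be compared with log B and log A."""
    return log_joint_density(xs, lam1) - log_joint_density(xs, lam0)
```

The first observation contributes nothing to the ratio, since its distribution is the same under both hypotheses; every later observation contributes through its departure from $\lambda_i$ times the preceding one.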

* Of course, for any positive integers $m$ and $m'$ with $m < m'$ the marginal distribution of
$x_1, \ldots, x_m$ determined on the basis of the joint distribution $p_{im'}(x_1, \ldots, x_{m'})$ must be
equal to $p_{im}(x_1, \ldots, x_m)$.

3.3. Determination of the values $A$ and $B$ in practice. Suppose that we wish
to have a sequential test such that the probability of an error of the first kind is
equal to $\alpha$ and the probability of an error of the second kind is equal to $\beta$. Denote
by $A(\alpha, \beta)$ and $B(\alpha, \beta)$ the values of $A$ and $B$ for which the probabilities of
the errors of the first and second kinds will take the desired values $\alpha$ and $\beta$.

The exact determination of the values $A(\alpha, \beta)$ and $B(\alpha, \beta)$ is rather laborious, as
will be seen in Section 3.4. The inequalities at our disposal, however, permit the
problem to be solved satisfactorily for practical purposes. From (3.18) and
(3.19) it follows that

(3.24)    $$A(\alpha, \beta) \le \frac{1 - \beta}{\alpha}$$

and

(3.25)    $$B(\alpha, \beta) \ge \frac{\beta}{1 - \alpha}.$$

Suppose we put $A = \frac{1 - \beta}{\alpha} = a(\alpha, \beta)$ (say), and $B = \frac{\beta}{1 - \alpha} = b(\alpha, \beta)$ (say).
Then $A$ is greater than or equal to the exact value $A(\alpha, \beta)$, and $B$ is less than or
equal to the exact value $B(\alpha, \beta)$. This procedure, of course, changes the probabilities
of errors of the first and second kind. If we were to use the exact value
of $B$ and a value of $A$ which is greater than the exact value, then evidently we
would lower the value of $\alpha$, but slightly increase the value of $\beta$. Similarly, if
we were to use the exact value of $A$ and a value of $B$ which is below the exact
value, then we would lower the value of $\beta$, but slightly increase the value of $\alpha$.
Thus, it is not clear what will be the resulting effect on $\alpha$ and $\beta$ if a value of $A$ is
used which is higher than the exact value, and a value of $B$ is used which is lower
than the exact value. Denote by $\alpha'$ and $\beta'$ the resulting probabilities of errors
of the first and second kind, respectively, if we put $A = \frac{1 - \beta}{\alpha}$ and $B = \frac{\beta}{1 - \alpha}$.

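We now derive inequalities satisfied by the quantities $\alpha'$, $\beta'$, $\alpha$ and $\beta$. Substituting
$a(\alpha, \beta)$ for $A$, $b(\alpha, \beta)$ for $B$, $\alpha'$ for $\alpha$ and $\beta'$ for $\beta$ we obtain from (3.18)
and (3.19)

(3.26)    $$\frac{\alpha'}{1 - \beta'} \le \frac{1}{a(\alpha, \beta)} = \frac{\alpha}{1 - \beta}$$

and

(3.27)    $$\frac{\beta'}{1 - \alpha'} \le b(\alpha, \beta) = \frac{\beta}{1 - \alpha}.$$

From these inequalities it follows that

(3.28)    $$\alpha' \le \frac{\alpha}{1 - \beta}$$

and

(3.29)    $$\beta' \le \frac{\beta}{1 - \alpha}.$$

Multiplying (3.26) by $(1 - \beta)(1 - \beta')$ and (3.27) by $(1 - \alpha)(1 - \alpha')$ and adding
the two resulting inequalities, we have

(3.30)    $$\alpha' + \beta' \le \alpha + \beta.$$

Thus, we see that at least one of the inequalities $\alpha' \le \alpha$ and $\beta' \le \beta$ must hold.
In other words, by using $a(\alpha, \beta)$ and $b(\alpha, \beta)$ instead of $A(\alpha, \beta)$ and $B(\alpha, \beta)$,
respectively, at most one of the probabilities $\alpha$ and $\beta$ may be increased.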

If $\alpha$ and $\beta$ are small (say less than .05), as they frequently will be in practical
applications, $\frac{\alpha}{1 - \beta}$ and $\frac{\beta}{1 - \alpha}$ are nearly equal to $\alpha$ and $\beta$, respectively. Thus,
we see from (3.28) and (3.29) that the quantity by which $\alpha'$ can possibly exceed
$\alpha$, or $\beta'$ can exceed $\beta$, must be small. Section 3.4 contains further inequalities
which show that the amount by which $\alpha'$ ($\beta'$) can possibly exceed $\alpha$ ($\beta$) is indeed
extremely small. Thus, for all practical purposes $\alpha' \le \alpha$ and $\beta' \le \beta$.
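
To see the size of the limits (3.28), (3.29) and (3.30) in a concrete case, take the illustrative values $\alpha = .01$ and $\beta = .05$ (chosen for this example only):

```latex
\[
\alpha' \le \frac{\alpha}{1-\beta} = \frac{0.01}{0.95} \approx 0.0105, \qquad
\beta' \le \frac{\beta}{1-\alpha} = \frac{0.05}{0.99} \approx 0.0505, \qquad
\alpha' + \beta' \le \alpha + \beta = 0.06 .
\]
```

Each error probability can therefore exceed its nominal value by at most about five parts in a hundred of that value, and by (3.30) the two cannot both exceed their nominal values at once.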

If $f_1(x)$ (the distribution under the alternative hypothesis) is sufficiently near
$f_0(x)$ (the distribution under the null hypothesis), $A(\alpha, \beta)$ and $B(\alpha, \beta)$ will be
nearly equal to $\frac{1 - \beta}{\alpha}$ and $\frac{\beta}{1 - \alpha}$, respectively; and consequently $\alpha'$ and $\beta'$ are
also very nearly equal to $\alpha$ and $\beta$ respectively. The reason that (3.18) and
(3.19) and therefore also (3.24) and (3.25) are inequalities instead of equalities
is that the sequential process may terminate with $\frac{p_{1n}}{p_{0n}} > A$ or $\frac{p_{1n}}{p_{0n}} < B$. If at
the final stage $\frac{p_{1n}}{p_{0n}}$ were exactly equal to $A$ or $B$, then $A(\alpha, \beta)$ and $B(\alpha, \beta)$ would
be exactly $\frac{1 - \beta}{\alpha}$ and $\frac{\beta}{1 - \alpha}$, respectively. If $f_1(x)$ is near $f_0(x)$, it is almost
certain that the value of $\frac{p_{1m}}{p_{0m}}$ is changed only slightly by one additional observation.
Thus, at the final stage $\frac{p_{1n}}{p_{0n}}$ will be only slightly above $A$, or slightly below
$B$ and consequently $A(\alpha, \beta)$ and $B(\alpha, \beta)$ will be nearly equal to $\frac{1 - \beta}{\alpha}$ and $\frac{\beta}{1 - \alpha}$,
respectively. If fractional observations were possible, that is to say, if the number
of observations were a continuous variable, $\frac{p_{1m}}{p_{0m}}$ would also be a continuous
function of $m$ and consequently $A(\alpha, \beta)$ and $B(\alpha, \beta)$ would be exactly equal to
$\frac{1 - \beta}{\alpha}$ and $\frac{\beta}{1 - \alpha}$, respectively. Thus, we have inequalities in (3.24) and (3.25)
instead of equalities merely on account of the fact that the number $m$ of observations
is discontinuous, i.e., $m$ can take only integral values.

Hence for all practical purposes the following procedure can be adopted: To
construct a sequential test such that the probability of an error of the first kind does
not exceed $\alpha$ and the probability of an error of the second kind does not exceed $\beta$, put
$A = \frac{1 - \beta}{\alpha}$ and $B = \frac{\beta}{1 - \alpha}$ and carry out the sequential test as defined by the
inequalities (3.8), (3.9) and (3.10).
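
The effect of this choice of $A$ and $B$ on the actual error probabilities can be examined without sampling in a simple discrete case. The sketch below assumes a binomial problem (success probability `p0` under $H_0$ and `p1` under $H_1$, an example chosen here, not one worked in this section) and propagates the probability of "still sampling" stage by stage. The process is cut off after `max_steps` stages, so the returned values are slight underestimates of $\alpha'$ and $\beta'$; all names are hypothetical.

```python
import math


def wald_boundaries(alpha, beta):
    """A = (1 - beta) / alpha and B = beta / (1 - alpha)."""
    return (1.0 - beta) / alpha, beta / (1.0 - alpha)


def bernoulli_sprt_error_probs(p0, p1, alpha, beta, max_steps=2000):
    """Error probabilities (alpha', beta') of the test using wald_boundaries."""
    A, B = wald_boundaries(alpha, beta)
    log_a, log_b = math.log(A), math.log(B)
    up = math.log(p1 / p0)                  # contribution of a success to Z_m
    down = math.log((1 - p1) / (1 - p0))    # contribution of a failure to Z_m

    def acceptance_probs(p):
        """(P(accept H0), P(accept H1)) when the true success probability is p."""
        alive = {0: 1.0}   # successes so far -> probability of still sampling
        acc0 = acc1 = 0.0
        for m in range(1, max_steps + 1):
            nxt = {}
            for k, mass in alive.items():
                for k2, w in ((k + 1, mass * p), (k, mass * (1 - p))):
                    z = k2 * up + (m - k2) * down
                    if z >= log_a:
                        acc1 += w
                    elif z <= log_b:
                        acc0 += w
                    else:
                        nxt[k2] = nxt.get(k2, 0.0) + w
            alive = nxt
            if not alive:
                break
        return acc0, acc1

    alpha_prime = acceptance_probs(p0)[1]   # H1 accepted although H0 is true
    beta_prime = acceptance_probs(p1)[0]    # H0 accepted although H1 is true
    return alpha_prime, beta_prime
```

For any admissible inputs the returned pair should respect the limits (3.28), (3.29) and (3.30), which gives a direct check of the procedure.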

In most practical cases the calculation of the exact values $A(\alpha, \beta)$ and $B(\alpha, \beta)$
will be of little interest for the following reasons: When $A = a(\alpha, \beta) = \frac{1 - \beta}{\alpha}$
and $B = b(\alpha, \beta) = \frac{\beta}{1 - \alpha}$, the probability $\alpha'$ of an error of the first kind cannot
exceed $\alpha$ and the probability $\beta'$ of an error of the second kind cannot exceed $\beta$,
except by a very small quantity which can be neglected for practical purposes.
Thus, for all practical purposes the use of $a(\alpha, \beta)$ and $b(\alpha, \beta)$ instead of $A(\alpha, \beta)$
and $B(\alpha, \beta)$ will not decrease the strength of the sequential test. The only
possible disadvantage from the substitution is that it may increase the expected
number of trials necessary for a decision. Since the discrepancy between $A(\alpha, \beta)$
and $B(\alpha, \beta)$ on the one hand and $a(\alpha, \beta)$ and $b(\alpha, \beta)$ on the other, arises only
from the discontinuity of the number $m$ of observations, it is clear that the increase
in the expected number of trials caused by the use of $a(\alpha, \beta)$ and $b(\alpha, \beta)$
will be slight. This slight increase, however, cannot be considered entirely a
loss for the following reason: if $a(\alpha, \beta) > A(\alpha, \beta)$ or $b(\alpha, \beta) < B(\alpha, \beta)$, then we
can sharpen the inequality (3.30) to $\alpha' + \beta' < \alpha + \beta$. Hence by using $a(\alpha, \beta)$
and $b(\alpha, \beta)$ we gain in strength.

The fact that for practical purposes we may put $A = a(\alpha, \beta)$ and $B = b(\alpha, \beta)$
brings out a surprising feature of the sequential test as compared with
current tests. While current tests cannot be carried out without finding the
probability distribution of the statistic on which the test is based, there are no
distribution problems in connection with sequential tests. In fact, $a(\alpha, \beta)$ and
$b(\alpha, \beta)$ depend on $\alpha$ and $\beta$ only, and the ratio $\frac{p_{1m}}{p_{0m}}$ can be calculated from the data
of the problem without solving any distribution problems. Distribution problems
arise in connection with the sequential process only if it is desired to find the
probability distribution of the number of trials necessary for reaching a final
decision. (This subject is discussed later.) But this is of secondary importance
as long as we know that the sequential test on the average leads to a saving in
the number of trials.

3.4. Probability of accepting $H_0$ (or $H_1$) when some third hypothesis $H$ is true.
In Section 3.2 we were concerned with the probability that the sequential probability
ratio test will lead to the acceptance of $H_0$ (or $H_1$) when $H_0$ or $H_1$ is true.
Since in Part II we shall admit an infinite set of alternatives, and since this is
the practically important case, it is of interest to study the probability of accepting
$H_0$ (or $H_1$) when any third hypothesis $H$, not necessarily equal to $H_0$ or $H_1$,
is true. Let $H$ be the hypothesis that the distribution of $X$ is given by $f(x)$.
If $f(x)$ is equal to $f_0(x)$ or $f_1(x)$ we have the special case discussed in Section 3.2.
In what follows in this and the subsequent sections any probability relationship
will be stated on the assumption that $H$ is true, unless a statement to the contrary
is explicitly made. Denote by $\gamma$ the probability that the sequential probability
ratio test will lead to the acceptance of $H_1$. Clearly, if $H = H_0$, then
$\gamma = \alpha$ and if $H = H_1$, then $\gamma = 1 - \beta$.

The probability $\gamma$ can readily be derived on the basis of the general theory of
cumulative sums given in [4]. Denote $\log \frac{f_1(x_i)}{f_0(x_i)}$ by $z_i$. Then $\{z_i\}$ ($i = 1, 2, \ldots$,
ad inf.) is a sequence of independent random variables each having the same
distribution. Denote by $Z_j$ the sum of the first $j$ elements of the sequence $\{z_i\}$, i.e.,

(3.31)    $$Z_j = z_1 + \cdots + z_j \qquad (j = 1, 2, \ldots, \text{ad inf.}).$$

For any relation $R$ we shall denote by $P(R)$ the probability that $R$ holds. For
any random variable $Y$ the symbol $EY$ will denote the expected value of $Y$.
Let $n$ be the smallest positive integer for which either $Z_n \ge \log A$ or $Z_n \le \log B$
holds. If $\log B < Z_m < \log A$ holds for $m = 1, 2, \ldots$, ad inf., we shall say that
$n = \infty$. Obviously, $n$ is the number of observations required by the sequential
probability ratio test. As we have seen in Section 3.3, in practice we shall put
$A = a(\alpha, \beta) = \frac{1 - \beta}{\alpha}$ and $B = b(\alpha, \beta) = \frac{\beta}{1 - \alpha}$. Since $B$ must be less than $A$,
we shall consider only values $\alpha$ and $\beta$ for which $\frac{1 - \beta}{\alpha} > \frac{\beta}{1 - \alpha}$. This inequality
is equivalent to $\alpha + \beta < 1$, which in turn implies that $B < 1$ and $A > 1$. Thus,
in all that follows it will be assumed that $A > 1$ and $B < 1$. We shall also
assume that the variance of $z_i$ is not zero.

According to Lemma 1 in [4] the relation $P(n = \infty) = 0$ holds. Hence, the
probability is equal to one that the sequential process will eventually terminate.
This implies that the probability of accepting $H_0$ is equal to $1 - \gamma$.

Let $z$ be a random variable whose distribution is equal to the common
distribution of the variates $z_i$ ($i = 1, 2, \ldots$, ad inf.). Denote by $\varphi(t)$ the moment
generating function of $z$, i.e.,

          $\varphi(t) = E e^{zt}.$

It was shown in [4] that under very mild restrictions on the distribution of $z$
there exists exactly one real value $h$ such that $h \neq 0$ and $\varphi(h) = 1$. Furthermore,
it was shown in [4] (see equation (16) in [4]) that

(3.32)    $E e^{Z_n h} = 1.$

Let $E^*$ be the conditional expected value of $e^{Z_n h}$ under the restriction that $H_0$
is accepted, i.e., that $Z_n \leq \log B$, and let $E^{**}$ be the conditional expected value
of $e^{Z_n h}$ under the restriction that $H_1$ is accepted, i.e., that $Z_n \geq \log A$. Then we
obtain from (3.32)

(3.33)    $(1 - \gamma)E^* + \gamma E^{**} = 1.$

The probability that $H_0$ will be accepted is equal to $1 - \gamma$, as will be seen later.
Solving for $\gamma$ we obtain

(3.34)    $\gamma = \frac{1 - E^*}{E^{**} - E^*}.$

If both the absolute value of $Ez$ and the variance of $z$ are small, which will be the
case when $f_1(x)$ is near $f_0(x)$, $E^*$ and $E^{**}$ will be nearly equal to $B^h$ and $A^h$,
respectively. Hence, in this case a good approximation to $\gamma$ is given by the
expression

(3.35)    $\bar{\gamma} = \frac{1 - B^h}{A^h - B^h}.$

It is easy to verify that $h = 1$ if $H = H_0$, and $h = -1$ if $H = H_1$. The
difference $\bar{\gamma} - \gamma$ approaches zero if both the mean and the variance of $z$ converge to
zero.
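
The approximation (3.35) is easy to evaluate numerically. The following Python sketch is only an illustration (the function names `sprt_boundaries` and `gamma_bar` are hypothetical, not from the paper); it uses the practical boundaries $A = (1-\beta)/\alpha$, $B = \beta/(1-\alpha)$ and checks the two special cases $h = 1$ and $h = -1$ mentioned above.

```python
def sprt_boundaries(alpha, beta):
    """Practical boundaries A = (1 - beta)/alpha and B = beta/(1 - alpha)."""
    return (1.0 - beta) / alpha, beta / (1.0 - alpha)


def gamma_bar(h, A, B):
    """Approximation (3.35) to the probability of accepting H1 (h must be nonzero)."""
    return (1.0 - B ** h) / (A ** h - B ** h)


A, B = sprt_boundaries(0.05, 0.10)
print(gamma_bar(1.0, A, B))    # case H = H0 (h = 1): should be close to alpha
print(gamma_bar(-1.0, A, B))   # case H = H1 (h = -1): should be close to 1 - beta
```

With these boundaries the approximation returns $\alpha$ for $h = 1$ and $1 - \beta$ for $h = -1$, in agreement with the remark that $\gamma = \alpha$ under $H_0$ and $\gamma = 1 - \beta$ under $H_1$.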

To judge the goodness of the approximation given by $\bar{\gamma}$, it is desirable to
derive lower and upper limits for $\gamma$. Such limits for $\gamma$ can be obtained by deriving
lower and upper limits for $E^*$ and $E^{**}$. First we consider the case when $h > 0$.
Let $\zeta$ be a real variable restricted to values $> 1$, and let $\rho$ be a positive variable
restricted to values $< 1$. For any random variable $Y$ and any relationship $R$
we shall denote by $E(Y \mid R)$ the conditional expected value of $Y$ under the
restriction that $R$ holds. It was shown in [4] (see relations (23) and (26) in [4];
the notation used here is somewhat different from that in [4]) that the following
inequalities hold:

(3.36)    $B^h \left\{ \underset{\zeta}{\mathrm{g.l.b.}}\ \zeta E\left(e^{hz} \mid e^{hz} < \tfrac{1}{\zeta}\right) \right\} \leq E^* \leq B^h \qquad (h > 0)$

and

(3.37)    $A^h \leq E^{**} \leq A^h \left\{ \underset{\rho}{\mathrm{l.u.b.}}\ \rho E\left(e^{hz} \mid e^{hz} \geq \tfrac{1}{\rho}\right) \right\} \qquad (h > 0).$

The symbol g.l.b. stands for the greatest lower bound with respect to $\zeta$, and the
symbol l.u.b. stands for the least upper bound with respect to $\rho$. Putting

(3.38)    $\underset{\zeta}{\mathrm{g.l.b.}}\ \zeta E\left(e^{hz} \mid e^{hz} < \tfrac{1}{\zeta}\right) = \eta$

and

(3.39)    $\underset{\rho}{\mathrm{l.u.b.}}\ \rho E\left(e^{hz} \mid e^{hz} \geq \tfrac{1}{\rho}\right) = \delta,$

the inequalities (3.36) and (3.37) can be written as

(3.40)    $B^h \eta \leq E^* \leq B^h \qquad (h > 0)$

and

(3.41)    $A^h \leq E^{**} \leq A^h \delta \qquad (h > 0).$

Since $B < 1$ and $A > 1$, we see that $E^* < 1$ and $E^{**} > 1$ if $h > 0$. From
this and the relations (3.34), (3.40) and (3.41) it follows easily that

(3.42)    $\frac{1 - B^h}{\delta A^h - B^h} \leq \gamma \leq \frac{1 - \eta B^h}{A^h - \eta B^h} \qquad (h > 0).$

If $h < 0$, limits for $\gamma$ can be obtained as follows: Let $z' = -z$, $A' = \frac{1}{B}$, $B' = \frac{1}{A}$.
Then $h' = -h > 0$ and $\gamma' = 1 - \gamma$. Thus, according to (3.42) we have

(3.43)    $\frac{1 - (B')^{h'}}{\delta'(A')^{h'} - (B')^{h'}} \leq 1 - \gamma \leq \frac{1 - \eta'(B')^{h'}}{(A')^{h'} - \eta'(B')^{h'}}$

where $\delta'$ and $\eta'$ are equal to the expressions we obtain from (3.39) and (3.38),
respectively, by substituting $h'$ for $h$ and $z'$ for $z$. Since $\eta$ and $\delta$ depend only on
the product $hz = h'z'$, we see that $\delta' = \delta$ and $\eta' = \eta$. Hence, we obtain from
(3.43)

(3.44)    $\frac{1 - A^h}{\delta B^h - A^h} \leq 1 - \gamma \leq \frac{1 - \eta A^h}{B^h - \eta A^h} \qquad (h < 0)$

where $\delta$ and $\eta$ are given by (3.39) and (3.38), respectively.

In Section 3.5 we shall calculate the value of $\eta$ and $\delta$ for binomial and normal
distributions. If the limits of $\gamma$, as given in (3.42) and (3.44), are too far apart,
it may be desirable to determine the exact value of $\gamma$, or at least to find a closer
approximation to $\gamma$ than that given in (3.35). A solution of this problem is
given in [4] (see section 7 of that paper). There the exact value of $\gamma$ is derived
when $z$ can take only a finite number of integral multiples of a constant $d$. If $z$
does not have this property, an arbitrarily fine approximation to the value of $\gamma$
can be obtained, since the distribution of $z$ can be approximated to any desired
degree by a discrete distribution of the type mentioned before if the constant $d$
is chosen sufficiently small. The results obtained in [4] can be stated as follows:
There is no loss of generality in assuming that $d = 1$, since the quantity $d$ can
be chosen as the unit of measurement. Thus, we shall assume that $z$ takes only
a finite number of integral values. Let $g_1$ and $g_2$ be two positive integers such
that $P(z = -g_1)$ and $P(z = g_2)$ are positive and $z$ can take only integral values
$\geq -g_1$ and $\leq g_2$. Denote $P(z = i)$ by $h_i$. Then the moment generating
function of $z$ is given by

          $\varphi(t) = \sum_{i=-g_1}^{g_2} h_i e^{it}.$

Put $u = e^t$ and let $u_1, \ldots, u_g$ be the $g = g_1 + g_2$ roots of the equation of $g$-th
degree

(3.45)    $\sum_{i=-g_1}^{g_2} h_i u^i = 1.$

Denote by $[a]$ the smallest integer $\geq \log A$, and by $[b]$ the largest integer $\leq \log B$.
Then $Z_n$ can take only the values

(3.46)    $[b] - g_1 + 1,\ [b] - g_1 + 2,\ \ldots,\ [b],\ [a],\ [a] + 1,\ \ldots,\ [a] + g_2 - 1.$

Denote the $g$ different integers in (3.46) by $c_1, \ldots, c_g$, respectively. Let $\Delta$ be
the determinant value of the matrix $\| u_i^{c_j} \|$ ($i, j = 1, \ldots, g$) and let $\Delta_j$ be the
determinant we obtain from $\Delta$ by substituting 1 for the elements in the $j$-th
column. Then, if $\Delta \neq 0$, the probability that $Z_n = c_j$ is given by

(3.47)    $P(Z_n = c_j) = \frac{\Delta_j}{\Delta}.$

Hence

(3.48)    $\gamma = P(Z_n \geq [a]) = \frac{\sum_j \Delta_j}{\Delta}$

where the summation is to be taken over all values of $j$ for which $c_j \geq [a]$.

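3.5. Calculation of $\delta$ and $\eta$ for binomial and normal distributions. Let $X$ be a
random variable which can take only the values 0 and 1. Let the probability
that $X = 1$ be $p_i$ if $H_i$ is true ($i = 0, 1$), and $p$ if $H$ is true. Denote $1 - p$ by $q$
and $1 - p_i$ by $q_i$ ($i = 0, 1$). Then $f_i(1) = p_i$, $f_i(0) = q_i$, $f(1) = p$ and $f(0) = q$.
It can be assumed without loss of generality that $p_1 > p_0$. The moment
generating function of $z = \log \frac{f_1(x)}{f_0(x)}$ is given by

          $\varphi(t) = p\left(\frac{p_1}{p_0}\right)^t + q\left(\frac{q_1}{q_0}\right)^t.$

Let $h \neq 0$ be the value of $t$ for which $\varphi(h) = 1$, i.e.,

          $p\left(\frac{p_1}{p_0}\right)^h + q\left(\frac{q_1}{q_0}\right)^h = 1.$

First we consider the case when $h > 0$. It is clear that $e^{hz} \geq 1$ implies
that $x = 1$. Hence $e^{hz} \geq 1$ implies that $e^{hz} = \left(\frac{p_1}{p_0}\right)^h$. From
this and the definition of $\delta$ given in (3.39) it follows that

(3.49)    $\delta = \left(\frac{p_1}{p_0}\right)^h \qquad (h > 0).$

Similarly, the inequality $e^{hz} < 1$ implies that $e^{hz} = \left(\frac{q_1}{q_0}\right)^h$. From this and the
definition of $\eta$ given in (3.38) it follows that

(3.50)    $\eta = \left(\frac{q_1}{q_0}\right)^h \qquad (h > 0).$

If $h < 0$, it can be shown in a similar way that

(3.51)    $\delta = \left(\frac{q_1}{q_0}\right)^h \qquad (h < 0)$

and

(3.52)    $\eta = \left(\frac{p_1}{p_0}\right)^h \qquad (h < 0).$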
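
The binomial case can be worked through numerically. The sketch below is an illustration under stated assumptions, not part of the original derivation: the root $h$ of $\varphi(h) = 1$ is found by plain bisection (the paper does not prescribe a numerical method), and the function names are hypothetical. It then evaluates $\delta$, $\eta$ from (3.49) to (3.52) and the limits for $\gamma$ from (3.42) or (3.44).

```python
import math


def binomial_h(p, p0, p1, iterations=200):
    """Nonzero root h of p*(p1/p0)**h + q*(q1/q0)**h = 1, assuming p0 < p1 and Ez != 0."""
    q, q0, q1 = 1.0 - p, 1.0 - p0, 1.0 - p1
    a, b = math.log(p1 / p0), math.log(q1 / q0)   # the two possible values of z
    mean_z = p * a + q * b
    if mean_z == 0.0:
        raise ValueError("Ez = 0: phi(t) = 1 has no nonzero real root")
    sign = 1.0 if mean_z < 0.0 else -1.0           # h > 0 exactly when Ez < 0

    def phi(t):                                    # phi evaluated at sign * t, t > 0
        return p * math.exp(sign * a * t) + q * math.exp(sign * b * t)

    lo = hi = 1.0
    while phi(hi) < 1.0:                           # move hi to the right of the root
        hi *= 2.0
    while phi(lo) >= 1.0:                          # move lo to the left of the root
        lo /= 2.0
        if lo < 1e-12:
            raise ValueError("root too close to zero to bracket")
    for _ in range(iterations):                    # plain bisection
        mid = 0.5 * (lo + hi)
        if phi(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return sign * 0.5 * (lo + hi)


def binomial_delta_eta(h, p0, p1):
    """(delta, eta) according to (3.49)-(3.52)."""
    rp, rq = (p1 / p0) ** h, ((1.0 - p1) / (1.0 - p0)) ** h
    return (rp, rq) if h > 0 else (rq, rp)


def gamma_limits(h, A, B, delta, eta):
    """Lower and upper limits for gamma from (3.42) if h > 0, from (3.44) if h < 0."""
    if h > 0:
        return ((1 - B ** h) / (delta * A ** h - B ** h),
                (1 - eta * B ** h) / (A ** h - eta * B ** h))
    low_c = (1 - A ** h) / (delta * B ** h - A ** h)       # lower limit for 1 - gamma
    high_c = (1 - eta * A ** h) / (B ** h - eta * A ** h)  # upper limit for 1 - gamma
    return 1 - high_c, 1 - low_c


h = binomial_h(0.1, 0.1, 0.2)                  # H = H0, so h should be close to 1
delta, eta = binomial_delta_eta(h, 0.1, 0.2)
print(h, gamma_limits(h, 19.0, 1.0 / 19.0, delta, eta))
```

For $p = p_0$ the root is $h = 1$ and for $p = p_1$ it is $h = -1$, so the limits can be compared directly with $\alpha$ and $1 - \beta$; the values $A = 19$, $B = 1/19$ in the usage line correspond to $\alpha = \beta = .05$.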

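Now we shall calculate the values of $\delta$ and $\eta$ if $X$ is normally distributed. Let

(3.53)    $f_i(x) = \frac{1}{\sqrt{2\pi}} e^{-(x - \theta_i)^2/2} \qquad (i = 0, 1)$

and

(3.54)    $f(x) = \frac{1}{\sqrt{2\pi}} e^{-(x - \theta)^2/2}.$

We can assume without loss of generality that $\theta_0 = -\Delta$ and $\theta_1 = \Delta$ where $\Delta > 0$,
since this can always be achieved by a translation. Then

(3.55)    $z = \log \frac{f_1(x)}{f_0(x)} = 2\Delta x.$

The moment generating function of $z$ is given by

(3.56)    $\varphi(t) = e^{2\Delta\theta t + 2\Delta^2 t^2}.$

Hence

(3.57)    $h = -\frac{\theta}{\Delta}.$

Substituting this value of $h$ in (3.38) and (3.39) we obtain

(3.58)    $\delta = \underset{\rho}{\mathrm{l.u.b.}}\ \rho E\left(e^{-2\theta x} \mid e^{-2\theta x} \geq \tfrac{1}{\rho}\right)$

and

(3.59)    $\eta = \underset{\zeta}{\mathrm{g.l.b.}}\ \zeta E\left(e^{-2\theta x} \mid e^{-2\theta x} < \tfrac{1}{\zeta}\right).$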

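For any relation $R$ let $P^*(R)$ denote the probability that the relation $R$ holds
calculated under the assumption that the distribution of $x$ is normal with mean
$\theta$ and variance unity. Furthermore, let $P^{**}(R)$ denote the probability that $R$
holds if the distribution of $x$ is normal with mean $-\theta$ and variance unity. Since
$e^{-2\theta x}$ is equal to the ratio of the normal probability density function with mean
$-\theta$ and variance unity to the normal probability density function with mean $\theta$
and variance unity, we see that

(3.60)    $E\left(e^{-2\theta x} \mid e^{-2\theta x} \geq \tfrac{1}{\rho}\right) = \frac{P^{**}(e^{-2\theta x} \geq 1/\rho)}{P^{*}(e^{-2\theta x} \geq 1/\rho)}$

and

(3.61)    $E\left(e^{-2\theta x} \mid e^{-2\theta x} < \tfrac{1}{\zeta}\right) = \frac{P^{**}(e^{-2\theta x} < 1/\zeta)}{P^{*}(e^{-2\theta x} < 1/\zeta)}.$

It can easily be verified that the right hand side expressions in (3.60) and
(3.61) have the same values for $\theta = \lambda$ as for $\theta = -\lambda$. Thus, also $\delta$ and $\eta$ have the
same values for $\theta = \lambda$ as for $\theta = -\lambda$. It will be, therefore, sufficient to compute
$\delta$ and $\eta$ for negative values of $\theta$. Let $\theta = -\lambda$ where $\lambda > 0$. First we show that
$\eta = \frac{1}{\delta}$. Clearly

(3.62)    $\zeta E\left(e^{2\lambda x} \mid e^{2\lambda x} < \tfrac{1}{\zeta}\right) = \frac{\zeta P^{**}(e^{2\lambda x} < 1/\zeta)}{P^{*}(e^{2\lambda x} < 1/\zeta)} \qquad (1 < \zeta < \infty).$

Letting $\zeta = \frac{1}{\rho}$ ($0 < \rho < 1$) in (3.62) gives

(3.63)    $\frac{\zeta P^{**}(e^{2\lambda x} < 1/\zeta)}{P^{*}(e^{2\lambda x} < 1/\zeta)} = \frac{P^{**}(e^{2\lambda x} < \rho)}{\rho P^{*}(e^{2\lambda x} < \rho)}.$

Hence

(3.64)    $\eta = \underset{\zeta}{\mathrm{g.l.b.}}\ \frac{\zeta P^{**}(e^{2\lambda x} < 1/\zeta)}{P^{*}(e^{2\lambda x} < 1/\zeta)} = \frac{1}{\underset{\rho}{\mathrm{l.u.b.}}\ \dfrac{\rho P^{*}(e^{2\lambda x} < \rho)}{P^{**}(e^{2\lambda x} < \rho)}}.$

Because of the symmetry of the normal distribution, it is easily seen that

          $P^{*}(e^{2\lambda x} < \rho) = P^{**}\left(e^{2\lambda x} > \tfrac{1}{\rho}\right)$ and $P^{**}(e^{2\lambda x} < \rho) = P^{*}\left(e^{2\lambda x} > \tfrac{1}{\rho}\right).$

Hence

(3.65)    $\eta = \frac{1}{\delta}.$
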

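Now we shall calculate the value of $\delta$. Denote $\frac{1}{\sqrt{2\pi}} \int_x^\infty e^{-t^2/2}\,dt$ by $G(x)$. Then

          $P^{**}\left(e^{2\lambda x} \geq \tfrac{1}{\rho}\right) = P^{**}\left(x \geq \tfrac{1}{2\lambda} \log \tfrac{1}{\rho}\right) = G\left(\tfrac{1}{2\lambda} \log \tfrac{1}{\rho} - \lambda\right).$

Similarly

          $P^{*}\left(e^{2\lambda x} \geq \tfrac{1}{\rho}\right) = P^{*}\left(x \geq \tfrac{1}{2\lambda} \log \tfrac{1}{\rho}\right) = G\left(\tfrac{1}{2\lambda} \log \tfrac{1}{\rho} + \lambda\right).$

Denote $\frac{1}{2\lambda} \log \frac{1}{\rho}$ by $u$. Since $\rho$ can vary from 0 to 1, $u$ can take any value from
0 to $\infty$. Since $\rho = e^{-2\lambda u}$ we have

(3.66)    $\delta = \underset{0 \leq u < \infty}{\mathrm{l.u.b.}}\ e^{-2\lambda u}\, \frac{G(u - \lambda)}{G(u + \lambda)}.$

We shall prove that

(3.67)    $\chi(u) = e^{-2\lambda u}\, \frac{G(u - \lambda)}{G(u + \lambda)}$

is a monotonically decreasing function of $u$ and consequently the maximum is
at $u = 0$. For this purpose it suffices to show that the derivative of $\log \chi(u)$
is never positive. Now

(3.68)    $\log \chi(u) = \log G(u - \lambda) - \log G(u + \lambda) - 2\lambda u.$

Denote $\frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ by $\Phi(x)$. Since $\frac{d}{du} G(u) = -\Phi(u)$ it follows from (3.68) that

(3.69)    $\frac{d}{du} \log \chi(u) = \frac{\Phi(u + \lambda)}{G(u + \lambda)} - \frac{\Phi(u - \lambda)}{G(u - \lambda)} - 2\lambda.$

It follows from the mean value theorem that the right hand side of (3.69) is
never positive if $\frac{d}{du}\left(\frac{\Phi(u)}{G(u)}\right) \leq 1$ for all values of $u$. Thus,
we need merely to show that

(3.70)    $\frac{d}{du}\left(\frac{\Phi(u)}{G(u)}\right) = \frac{\Phi'(u)G(u) - G'(u)\Phi(u)}{G^2(u)} = \frac{-u\Phi(u)G(u) + \Phi^2(u)}{G^2(u)} = \frac{\Phi^2(u)}{G^2(u)} - u\,\frac{\Phi(u)}{G(u)} \leq 1.$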


Denote $\frac{\Phi(u)}{G(u)}$ by $y$. The roots of the equation $y^2 - uy - 1 = 0$ are

          $y = \frac{u \pm \sqrt{u^2 + 4}}{2}.$

Hence the inequality $y^2 - uy - 1 \leq 0$ holds if and only if

          $\frac{u - \sqrt{u^2 + 4}}{2} \leq y \leq \frac{u + \sqrt{u^2 + 4}}{2}.$

Since $y$ cannot be negative, this inequality is equivalent to

(3.71)    $y = \frac{\Phi(u)}{G(u)} \leq \frac{u + \sqrt{u^2 + 4}}{2}.$

Thus we have merely to prove (3.71). We shall show that (3.71) holds for
all real values of $u$. Birnbaum has shown [5] that for $u > 0$

(3.72)    $\frac{\sqrt{u^2 + 4} - u}{2}\, \Phi(u) \leq G(u).$

Hence

(3.73)    $\frac{\Phi(u)}{G(u)} \leq \frac{2}{\sqrt{u^2 + 4} - u} = \frac{\sqrt{u^2 + 4} + u}{2} \qquad (u > 0)$

which proves (3.71) for $u > 0$. Now we prove (3.71) for $u < 0$. Let $u = -v$
where $v > 0$. Then it follows from (3.73) that

(3.74)    $\frac{\Phi(v)}{G(v)} \leq \frac{2}{\sqrt{4 + v^2} - v}.$

Taking reciprocals, we obtain from (3.74)

(3.75)    $\frac{G(v)}{\Phi(v)} \geq \frac{\sqrt{4 + v^2} - v}{2}.$

Since

          $\frac{G(u)}{\Phi(u)} \geq \frac{G(v) + 2v\Phi(v)}{\Phi(v)} = \frac{G(v)}{\Phi(v)} + 2v,$

we obtain from (3.75)

(3.76)    $\frac{G(u)}{\Phi(u)} \geq \frac{\sqrt{v^2 + 4} + 3v}{2} \geq \frac{\sqrt{v^2 + 4} + v}{2}.$

Taking reciprocals, we obtain

          $\frac{\Phi(u)}{G(u)} \leq \frac{2}{\sqrt{v^2 + 4} + v} = \frac{\sqrt{v^2 + 4} - v}{2} = \frac{\sqrt{u^2 + 4} + u}{2}.$

Hence (3.71) is proved for all values of $u$ and consequently $\delta$ is equal to the value
of the expression (3.67) if we substitute 0 for $u$. Thus,

(3.77)    $\delta = \frac{G(-\lambda)}{G(\lambda)}.$
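
The monotonicity argument can be checked numerically. The following Python sketch is illustrative only (the function names are hypothetical): it evaluates the expression (3.67) on a grid of $u$ values, using the complementary error function for the upper normal tail $G$, and compares its value at $u = 0$ with (3.77).

```python
import math


def G(x):
    """Upper tail of the standard normal distribution, as G is defined in the text."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def chi(u, lam):
    """Expression (3.67): exp(-2*lam*u) * G(u - lam) / G(u + lam)."""
    return math.exp(-2.0 * lam * u) * G(u - lam) / G(u + lam)


def delta_normal(lam):
    """Formula (3.77); by (3.65), eta is the reciprocal of this value."""
    return G(-lam) / G(lam)


lam = 0.3
grid = [0.05 * k for k in range(101)]          # u from 0 to 5
values = [chi(u, lam) for u in grid]
print(values[0], delta_normal(lam))            # the maximum is attained at u = 0
```

On this grid the values decrease steadily, so the least upper bound in (3.66) is attained at $u = 0$, as the proof asserts.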


4. The Number of Observations Required by the Sequential Probability 

Ratio Test 

4.1. Expected number of observations necessary for reaching a decision. As
before, let

          $z = \log \frac{f_1(x)}{f_0(x)}, \qquad z_i = \log \frac{f_1(x_i)}{f_0(x_i)} \qquad (i = 1, 2, \ldots, \text{ad inf.})$

and let $n$ be the number of observations required by the sequential test, i.e., $n$ is
the smallest integer for which $Z_n = z_1 + \cdots + z_n$ is either $\geq \log A$ or $\leq \log B$.

To determine the expected value $E(n)$ of $n$ under any hypothesis $H$ we shall
consider a fixed positive integer $N$. The sum $Z_N = z_1 + \cdots + z_N$ can be split
in two parts as follows

(4.1)     $Z_N = Z_n + Z'_n$

where $Z'_n = z_{n+1} + \cdots + z_N$ if $n \leq N$ and $Z'_n = Z_N - Z_n$ if $n > N$. Taking
expected values on both sides of (4.1) we obtain

(4.2)     $N\,Ez = EZ_n + EZ'_n.$

Since the probability that $n > N$ converges to zero as $N \to \infty$, and since
$|Z'_n| < 2(\log A + |\log B|)$ if $n > N$, it can be seen that

(4.3)     $\lim_{N \to \infty} \left[ EZ'_n - E(N - n)\,Ez \right] = 0.$

From (4.2) and (4.3) it follows that

(4.4)     $EZ_n = En \cdot Ez.$

Hence

(4.5)     $En = \frac{EZ_n}{Ez}.$


Let $E^*Z_n$ be the conditional expected value of $Z_n$ under the restriction that the
sequential analysis leads to the acceptance of $H_0$, i.e., that $Z_n \leq \log B$.
Similarly, let $E^{**}Z_n$ be the conditional expected value of $Z_n$ under the restriction that
$H_1$ is accepted, i.e., that $Z_n \geq \log A$. Since $\gamma$ is the probability that $Z_n \geq \log A$,
we have

(4.6)     $EZ_n = (1 - \gamma)E^*Z_n + \gamma E^{**}Z_n.$

From (4.5) and (4.6) we obtain

(4.7)     $En = \frac{(1 - \gamma)E^*Z_n + \gamma E^{**}Z_n}{Ez}.$

The exact value of $EZ_n$, and therefore also the exact value of $En$, can be
computed if $z$ can take only integral multiples of a constant $d$, since in this case the
exact probability distribution of $Z_n$ was obtained (see equation (3.47)). If $z$
does not satisfy the above restriction, it is still possible to obtain arbitrarily fine
approximations to the value of $EZ_n$, since the distribution of $z$ can be
approximated to any desired degree by a discrete distribution of the type mentioned
above if the constant $d$ is chosen sufficiently small.

If both $|Ez|$ and the standard deviation of $z$ are small, $E^*Z_n$ is very nearly
equal to $\log B$ and $E^{**}Z_n$ is very nearly equal to $\log A$. Hence in this case we
can write

(4.8)     $En \sim \frac{(1 - \gamma)\log B + \gamma \log A}{Ez}.$

To judge the goodness of the approximation given in (4.8) we shall derive lower
and upper limits for $En$ by deriving lower and upper limits for $E^*Z_n$ and $E^{**}Z_n$.
Let $r$ be a non-negative variable and let

(4.9)     $\xi = \underset{r}{\mathrm{Max}}\ E(z - r \mid z \geq r) \qquad (r \geq 0)$

and

(4.10)    $\xi' = \underset{r}{\mathrm{Min}}\ E(z + r \mid z + r \leq 0) \qquad (r \geq 0).$

It is easy to see that

(4.11)    $\log A \leq E^{**}Z_n \leq \log A + \xi$

and

(4.12)    $\log B + \xi' \leq E^*Z_n \leq \log B.$

We obtain from (4.7), (4.11) and (4.12)

(4.13)    $\frac{(1 - \gamma)(\log B + \xi') + \gamma \log A}{Ez} \leq En \leq \frac{(1 - \gamma)\log B + \gamma(\log A + \xi)}{Ez}$

if $Ez > 0$, and

(4.14)    $\frac{(1 - \gamma)\log B + \gamma(\log A + \xi)}{Ez} \leq En \leq \frac{(1 - \gamma)(\log B + \xi') + \gamma \log A}{Ez}$

if $Ez < 0$.

4.2. Calculation of the quantities $\xi$ and $\xi'$ for binomial and normal distributions.
Let $X$ be a random variable which can take only the values 0 and 1. Let the
probability that $X = 1$ be $p_i$ if $H_i$ is true ($i = 0, 1$), and $p$ if $H$ is true. Denote
$1 - p$ by $q$ and $1 - p_i$ by $q_i$ ($i = 0, 1$). Then $f_i(1) = p_i$, $f_i(0) = q_i$, $f(1) = p$
and $f(0) = q$. It can be assumed without loss of generality that $p_1 > p_0$. It
is clear that $\log \frac{f_1(x)}{f_0(x)} \geq 0$ implies that $x = 1$ and consequently
$\log \frac{f_1(x)}{f_0(x)} = \log \frac{p_1}{p_0}$. Hence

(4.15)    $\xi = \underset{r}{\mathrm{Max}}\ E(z - r \mid z \geq r) = \log \frac{p_1}{p_0}.$

Since $\log \frac{f_1(x)}{f_0(x)} \leq 0$ implies that $x = 0$, we have

(4.16)    $\xi' = \underset{r}{\mathrm{Min}}\ E(z + r \mid z + r \leq 0) = \log \frac{q_1}{q_0}.$
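
As a worked illustration, the approximation (4.8) and the limits (4.13), (4.14) can be evaluated for a binomial example. The sketch below is not from the paper: the function name and the numerical values ($p_0 = .1$, $p_1 = .2$, $\alpha = \beta = .05$, $H = H_0$, with $\gamma$ taken equal to $\alpha$) are assumptions chosen for the example.

```python
import math


def asn_with_limits(gamma, A, B, mean_z, xi, xi_prime):
    """Return (lower limit, approximation (4.8), upper limit) for En.

    The two candidate limits are those of (4.13) and (4.14); taking min and max
    selects the right order for either sign of Ez.
    """
    log_a, log_b = math.log(A), math.log(B)
    approx = ((1 - gamma) * log_b + gamma * log_a) / mean_z
    e1 = ((1 - gamma) * (log_b + xi_prime) + gamma * log_a) / mean_z
    e2 = ((1 - gamma) * log_b + gamma * (log_a + xi)) / mean_z
    return min(e1, e2), approx, max(e1, e2)


p0, p1, alpha, beta = 0.1, 0.2, 0.05, 0.05
A, B = (1 - beta) / alpha, beta / (1 - alpha)
xi = math.log(p1 / p0)                        # (4.15)
xi_prime = math.log((1 - p1) / (1 - p0))      # (4.16)
# Ez under H = H0, i.e. p = p0
mean_z = p0 * math.log(p1 / p0) + (1 - p0) * math.log((1 - p1) / (1 - p0))
lower, approx, upper = asn_with_limits(alpha, A, B, mean_z, xi, xi_prime)
print(lower, approx, upper)
```

In this example the approximation is about 72 observations, and the limits differ from it by only a few observations, which gives a feeling for how much the neglected excess over the boundaries can matter.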


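Now we shall calculate the values $\xi$ and $\xi'$ if $X$ is normally distributed. Let

          $f_i(x) = \frac{1}{\sqrt{2\pi}} e^{-(x - \theta_i)^2/2} \qquad (i = 0, 1) \qquad (\theta_1 > \theta_0)$

and

          $f(x) = \frac{1}{\sqrt{2\pi}} e^{-(x - \theta)^2/2}.$

We may assume without loss of generality that $\theta_0 = -\Delta$ and $\theta_1 = \Delta$ where
$\Delta > 0$, since this can always be achieved by a translation. Then

(4.17)    $z = \log \frac{f_1(x)}{f_0(x)} = 2\Delta x.$

Denote $\frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ by $\Phi(x)$ and $\frac{1}{\sqrt{2\pi}} \int_x^\infty e^{-t^2/2}\,dt$ by $G(x)$. Let $t = x - \theta$.
Then $z = 2\Delta(t + \theta)$ and

(4.18)    $E(z - r \mid z - r \geq 0) = 2\Delta\, E\left(t + \theta - \tfrac{r}{2\Delta} \mid t + \theta - \tfrac{r}{2\Delta} \geq 0\right)$
          $= \frac{2\Delta}{G(t_0)} \int_{t_0}^{\infty} (t - t_0)\Phi(t)\,dt = \frac{2\Delta}{G(t_0)} \left[ -t_0 G(t_0) + \Phi(t_0) \right] = 2\Delta \left[ \frac{\Phi(t_0)}{G(t_0)} - t_0 \right]$

where

(4.19)    $t_0 = \frac{r}{2\Delta} - \theta.$

In section 3.5 (see equation (3.70)) it was proved that $\frac{\Phi(t_0)}{G(t_0)} - t_0$ is a
monotonically decreasing function of $t_0$. Hence the maximum of $E(z - r \mid z - r \geq 0)$
is reached for $r = 0$ and consequently

(4.20)    $\xi = \frac{2\Delta}{G(-\theta)} \left[ \theta G(-\theta) + \Phi(-\theta) \right] = 2\Delta \left[ \theta + \frac{\Phi(-\theta)}{G(-\theta)} \right].$

Now we shall calculate $\xi'$. We have

(4.21)    $\xi' = \underset{r}{\mathrm{Min}}\ E(z + r \mid z + r \leq 0) = -\underset{r}{\mathrm{Max}}\ E(-z - r \mid -z - r \geq 0)$
          $= -2\Delta\ \underset{r}{\mathrm{Max}}\ E\left(-x - \tfrac{r}{2\Delta} \mid -x - \tfrac{r}{2\Delta} \geq 0\right).$

Let $t = -x + \theta$ and $t_0 = \frac{r}{2\Delta} + \theta$. Then

(4.22)    $E\left(-x - \tfrac{r}{2\Delta} \mid -x - \tfrac{r}{2\Delta} \geq 0\right) = E(t - t_0 \mid t - t_0 \geq 0) = \frac{\Phi(t_0)}{G(t_0)} - t_0.$

Since this is a monotonically decreasing function of $t_0$, we have

(4.23)    $\underset{r}{\mathrm{Max}}\ E\left(-x - \tfrac{r}{2\Delta} \mid -x - \tfrac{r}{2\Delta} \geq 0\right) = \frac{\Phi(\theta)}{G(\theta)} - \theta.$

From (4.21) and (4.23) we obtain

(4.24)    $\xi' = -2\Delta \left[ \frac{\Phi(\theta)}{G(\theta)} - \theta \right].$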


4.3. Saving in the number of observations as compared with the current test
procedure. We consider the case of a normally distributed variate, such that

          $f_i(x) = \frac{1}{\sqrt{2\pi}} e^{-(x - \theta_i)^2/2} \qquad (i = 0, 1) \qquad (\theta_1 \neq \theta_0).$

Denote by $n(\alpha, \beta)$ the minimum number of observations necessary in the current
most powerful test for the probabilities of errors of the first and second kinds
to be $\alpha$ and $\beta$, respectively, or less.

We shall calculate the number of observations required by the most powerful
test. It can be assumed without loss of generality that $\theta_0 < \theta_1$. According
to the current most powerful test procedure the hypothesis $H_0$ is accepted if
$\bar{x} < d$ and the hypothesis $H_1$ is accepted if $\bar{x} \geq d$, where $\bar{x}$ is the arithmetic
mean of the observations and $d$ is a properly chosen constant. The probability
of an error of the first kind is given by $G[\sqrt{n}(d - \theta_0)]$ and the probability of an
error of the second kind is given by $1 - G[\sqrt{n}(d - \theta_1)]$ where
$G(t) = \frac{1}{\sqrt{2\pi}} \int_t^\infty e^{-x^2/2}\,dx$. To equate these probabilities to $\alpha$ and $\beta$, respectively, the
quantities $d$ and $n$ must satisfy

(4.25)    $G[\sqrt{n}(d - \theta_0)] = \alpha$

and

(4.26)    $1 - G[\sqrt{n}(d - \theta_1)] = \beta.$

Denote by $\lambda_0$ and $\lambda_1$ the values for which $G(\lambda_0) = \alpha$ and $G(\lambda_1) = 1 - \beta$. Then
we have

(4.27)    $\sqrt{n}(d - \theta_0) = \lambda_0$

and

(4.28)    $\sqrt{n}(d - \theta_1) = \lambda_1.$

Subtracting (4.27) from (4.28) we obtain

(4.29)    $\sqrt{n}(\theta_0 - \theta_1) = \lambda_1 - \lambda_0.$

From (4.29)

(4.30)    $n = n(\alpha, \beta) = \frac{(\lambda_1 - \lambda_0)^2}{(\theta_0 - \theta_1)^2}.$

If the expression on the right hand side of (4.30) is not an integer, $n(\alpha, \beta)$ is the
smallest integer in excess.
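
Formula (4.30) translates directly into code. The sketch below is an illustration only: it assumes unit-variance normal observations as in the text, uses the standard library `statistics.NormalDist` for the normal quantiles, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def fixed_sample_size(alpha, beta, theta0, theta1):
    """Smallest integer n not below the right hand side of (4.30)."""
    std = NormalDist()
    lam0 = std.inv_cdf(1.0 - alpha)   # G(lam0) = alpha, G being the upper tail
    lam1 = std.inv_cdf(beta)          # G(lam1) = 1 - beta
    return math.ceil((lam1 - lam0) ** 2 / (theta0 - theta1) ** 2)


print(fixed_sample_size(0.05, 0.05, 0.0, 0.5))
```

Since $G$ is the upper tail of the normal distribution, $\lambda_0$ is the upper $\alpha$ point and $\lambda_1$ is the negative of the upper $\beta$ point, so $(\lambda_1 - \lambda_0)^2$ is the square of the sum of the two upper percentage points.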

In the sequential probability ratio test we put $A = a(\alpha, \beta) = \frac{1 - \beta}{\alpha}$ and
$B = b(\alpha, \beta) = \frac{\beta}{1 - \alpha}$. Then the probability of an error of the first (second)
kind cannot exceed $\alpha$ ($\beta$) except by a negligible amount. Let $A(\alpha, \beta)$ and
$B(\alpha, \beta)$ be the values of $A$ and $B$ for which the probabilities of errors of the first
and second kinds become exactly equal to $\alpha$ and $\beta$, respectively. It has been
shown in Section 3.2 that $A(\alpha, \beta) \leq a(\alpha, \beta)$ and $B(\alpha, \beta) \geq b(\alpha, \beta)$. Thus, the
expected values $E_1(n)$ and $E_0(n)$ are only increased by putting $A = a(\alpha, \beta)$ and
$B = b(\alpha, \beta)$ instead of $A = A(\alpha, \beta)$ and $B = B(\alpha, \beta)$.

Consider the case where $|\theta_1 - \theta_0|$ is small so that the quantities $\xi$ and $\xi'$ can
be neglected. Thus, we shall use the approximation (4.8). Since $\gamma = \alpha$ if $H = H_0$
and $\gamma = 1 - \beta$ if $H = H_1$, we obtain from (4.8)

(4.31)    $E_1(n) \sim \frac{(1 - \beta)a^* + \beta b^*}{E_1(z)}$

and

(4.32)    $E_0(n) \sim \frac{-(1 - \alpha)b^* - \alpha a^*}{E_0(-z)}$

where $a^* = \log a(\alpha, \beta) = \log \frac{1 - \beta}{\alpha}$ and $b^* = \log b(\alpha, \beta) = \log \frac{\beta}{1 - \alpha}$. Since

(4.33)    $E_1(z) = \tfrac{1}{2}(\theta_0 - \theta_1)^2$

and

(4.34)    $E_0(-z) = \tfrac{1}{2}(\theta_0 - \theta_1)^2,$

it follows from (4.30), (4.31) and (4.32) that $\frac{E_1(n)}{n(\alpha, \beta)}$ and $\frac{E_0(n)}{n(\alpha, \beta)}$ are independent
of the parameters $\theta_0$ and $\theta_1$.


TABLE 1

Average percentage saving of sequential analysis, as compared with current most
powerful test for testing mean of a normally distributed variate

A. When alternative hypothesis is true:

   β \ α     .01     .02     .03     .04     .05
    .01       58      60      61      62      63
    .02       54      56      57      58      59
    .03       51      53      54      55      55
    .04       49      50      51      52      53
    .05       47      49      50      50      51

B. When null hypothesis is true:

   β \ α     .01     .02     .03     .04     .05
    .01       58      54      51      49      47
    .02       60      56      53      50      49
    .03       61      57      54      51      50
    .04       62      58      55      52      50
    .05       63      59      55      53      51


The average saving of the sequential analysis as compared with the current
method is $100\left(1 - \frac{E_1(n)}{n(\alpha, \beta)}\right)$ per cent if $H_1$ is true, and $100\left(1 - \frac{E_0(n)}{n(\alpha, \beta)}\right)$ per
cent if $H_0$ is true. In Table 1 the expression $100\left(1 - \frac{E_1(n)}{n(\alpha, \beta)}\right)$ is shown in Panel A,
and $100\left(1 - \frac{E_0(n)}{n(\alpha, \beta)}\right)$ in Panel B, for several values of $\alpha$ and $\beta$.
Because of the symmetry of the normal distribution, Panel B is obtained from
Panel A simply by interchanging $\alpha$ and $\beta$.

As can be seen from the table, for the range of $\alpha$ and $\beta$ from .01 to .05 (the
range most frequently employed), the sequential process leads to an average
saving of at least 47 per cent in the necessary number of observations as
compared with the current procedure. The true saving is slightly greater than shown
in the table, since $E_i(n)$ calculated under the condition that $A = a(\alpha, \beta)$ and
$B = b(\alpha, \beta)$ is greater than $E_i(n)$ calculated under the condition that
$A = A(\alpha, \beta)$ and $B = B(\alpha, \beta)$.
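
The entries of Table 1 can be recomputed from (4.8) and (4.30). The following Python sketch is an illustration under the assumptions of this section (unit-variance normal variates, excess over the boundaries neglected); the function name is hypothetical. Because the ratio $E_i(n)/n(\alpha, \beta)$ does not depend on $\theta_0$ and $\theta_1$, the common factor $(\theta_0 - \theta_1)^2$ is simply dropped.

```python
import math
from statistics import NormalDist


def saving_percent(alpha, beta, under_h1):
    """100 * (1 - E_i(n) / n(alpha, beta)); the factor (theta0 - theta1)^2 cancels."""
    std = NormalDist()
    # (lambda_1 - lambda_0)^2 of (4.30), written with upper percentage points
    n_fixed = (std.inv_cdf(1 - alpha) + std.inv_cdf(1 - beta)) ** 2
    a_star = math.log((1 - beta) / alpha)
    b_star = math.log(beta / (1 - alpha))
    if under_h1:
        # (4.8) with gamma = 1 - beta and Ez = +1/2 (in units of (theta0 - theta1)^2)
        e_n = ((1 - beta) * a_star + beta * b_star) / 0.5
    else:
        # (4.8) with gamma = alpha and Ez = -1/2
        e_n = (alpha * a_star + (1 - alpha) * b_star) / (-0.5)
    return 100.0 * (1.0 - e_n / n_fixed)


levels = [0.01, 0.02, 0.03, 0.04, 0.05]
for beta in levels:
    print(beta, [round(saving_percent(alpha, beta, True)) for alpha in levels])
```

For example, with $\alpha = \beta = .01$ the computed saving is about 58 per cent under either hypothesis, and with $\alpha = .01$, $\beta = .05$ it is about 47 per cent when the alternative hypothesis is true and about 63 per cent when the null hypothesis is true.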

4.4. The characteristic function, the moments and the distribution of the number
of observations necessary for reaching a decision. It was shown in [4] (see
equation (15) in [4]) that the following fundamental identity holds

(4.35)    $E\left\{ e^{Z_n t} [\varphi(t)]^{-n} \right\} = 1 \qquad (\varphi(t) = Ee^{zt})$

for all points $t$ of the complex plane for which $\varphi(t)$ exists and $|\varphi(t)| \geq 1$. The
symbol $n$ denotes the number of observations required by the sequential test,
i.e., $n$ is the smallest positive integer for which $Z_n$ is either $\geq \log A$ or $\leq \log B$,
and $\varphi(t)$ denotes the moment generating function of $z$.

On the basis of the identity (4.35) the exact characteristic function of $n$ is
derived in section 7 of [4] in the case when $z$ can take only integral multiples of
a constant. If the number of different values which $Z_n$ can take is large, the
calculation of the exact characteristic function is cumbersome, because a large
number of simultaneous linear equations have to be solved. However, if $|Ez|$
and $\sigma_z$ are small so that $|Z_n - \log A|$ (when $Z_n \geq \log A$) and $|Z_n - \log B|$
(when $Z_n \leq \log B$) can be neglected, the calculation of the characteristic
function is much simpler, as was shown in [4]. We shall briefly state the results
obtained in [4]. Let $h$ be the real value $\neq 0$ for which $\varphi(h) = 1$. Furthermore
let $t = t_1(\tau)$ and $t = t_2(\tau)$ be the roots of the equation in $t$

          $-\log \varphi(t) = \tau$

such that $\lim_{\tau \to 0} t_1(\tau) = 0$ and $\lim_{\tau \to 0} t_2(\tau) = h$. Finally, let $\psi_1(\tau)$ be the
characteristic function of the conditional distribution of $n$ under the restriction that
$Z_n \geq \log A$, and $\psi_2(\tau)$ the characteristic function of the conditional distribution of $n$
under the restriction that $Z_n \leq \log B$. Then, if $|Z_n - \log A|$ (when $Z_n \geq \log A$)
and $|Z_n - \log B|$ (when $Z_n \leq \log B$) can be neglected, $\psi_1(\tau)$ and $\psi_2(\tau)$ are
the solutions of the linear equations

(4.36)    $\gamma \psi_1(\tau) A^{t_1(\tau)} + (1 - \gamma)\psi_2(\tau) B^{t_1(\tau)} = 1$

and

(4.37)    $\gamma \psi_1(\tau) A^{t_2(\tau)} + (1 - \gamma)\psi_2(\tau) B^{t_2(\tau)} = 1$

where

          $\gamma = P(Z_n \geq \log A) = \frac{1 - B^h}{A^h - B^h}.$

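The characteristic function of the unconditional distribution of $n$ is

(4.38)    $\psi(\tau) = \gamma \psi_1(\tau) + (1 - \gamma)\psi_2(\tau).$

As an illustration we shall determine $\psi_1(\tau)$, $\psi_2(\tau)$ and $\psi(\tau)$ when $z$ has a normal
distribution. Then we have

          $\log \varphi(t) = (Ez)\,t + \frac{\sigma_z^2}{2}\,t^2.$

Hence

          $h = -\frac{2Ez}{\sigma_z^2},$

(4.39)    $t_1(\tau) = \frac{1}{\sigma_z^2} \left( -Ez + \sqrt{(Ez)^2 - 2\sigma_z^2 \tau} \right),$

(4.40)    $t_2(\tau) = \frac{1}{\sigma_z^2} \left( -Ez - \sqrt{(Ez)^2 - 2\sigma_z^2 \tau} \right).$

From (4.36), (4.37) and (4.38) we obtain

(4.41)    $\gamma \psi_1(\tau) = \frac{B^{g_2} - B^{g_1}}{A^{g_1}B^{g_2} - A^{g_2}B^{g_1}},$

(4.42)    $(1 - \gamma)\psi_2(\tau) = \frac{A^{g_1} - A^{g_2}}{A^{g_1}B^{g_2} - A^{g_2}B^{g_1}}$

and

(4.43)    $\psi(\tau) = \frac{A^{g_1} + B^{g_2} - A^{g_2} - B^{g_1}}{A^{g_1}B^{g_2} - A^{g_2}B^{g_1}}$

where

(4.44)    $g_1 = \frac{1}{\sigma_z^2} \left( -Ez + \sqrt{(Ez)^2 - 2\sigma_z^2 \tau} \right)$

and

(4.45)    $g_2 = \frac{1}{\sigma_z^2} \left( -Ez - \sqrt{(Ez)^2 - 2\sigma_z^2 \tau} \right).$

For any positive integer $r$ the $r$-th moment of $n$, i.e., $E(n^r)$, is equal to the $r$-th
derivative of $\psi(\tau)$ taken at $\tau = 0$. Let $E^*(n^r)$ be the conditional expected value
of $n^r$ under the restriction that $Z_n \leq \log B$, and let $E^{**}(n^r)$ be the conditional
expected value of $n^r$ under the restriction that $Z_n \geq \log A$. Then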


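(4.46)    $E^*(n^r) = \left. \frac{d^r \psi_2(\tau)}{d\tau^r} \right|_{\tau = 0}$ and $E^{**}(n^r) = \left. \frac{d^r \psi_1(\tau)}{d\tau^r} \right|_{\tau = 0}.$

It may be of interest to note that $\left. \frac{d^r \psi_k(\tau)}{d\tau^r} \right|_{\tau = 0}$ ($k = 1, 2$) and therefore also the
moments of $n$ can be obtained from the identity (4.35) directly by successive
differentiation. In fact, the identity (4.35) can be written as (neglecting the
excess of $Z_n$ over the boundaries $\log A$ and $\log B$)

(4.47)    $\gamma A^t \psi_1[-\log \varphi(t)] + (1 - \gamma)B^t \psi_2[-\log \varphi(t)] = 1.$

Taking the first $r$ derivatives of (4.47) with respect to $t$ at $t = 0$ and $t = h$
we obtain a system of $2r$ linear equations in the $2r$ unknowns
$\left. \frac{d^k \psi_i(\tau)}{d\tau^k} \right|_{\tau = 0}$ ($k = 1, \ldots, r$; $i = 1, 2$) from which these unknowns can be determined.
For example, $\left. \frac{d\psi_k(\tau)}{d\tau} \right|_{\tau = 0}$ ($k = 1, 2$) can be determined as follows: Taking the
first derivative of (4.47) with respect to $t$ and denoting $\frac{d^r \psi_k(\tau)}{d\tau^r}$ by $\psi_k^{(r)}(\tau)$ we obtain

(4.48)    $\gamma(\log A)A^t \psi_1[-\log \varphi(t)] - \gamma A^t \frac{\varphi'(t)}{\varphi(t)} \psi_1^{(1)}[-\log \varphi(t)]$
          $+ (1 - \gamma)(\log B)B^t \psi_2[-\log \varphi(t)] - (1 - \gamma)B^t \frac{\varphi'(t)}{\varphi(t)} \psi_2^{(1)}[-\log \varphi(t)] = 0.$

Putting $t = 0$ and $t = h$ we obtain the equations

(4.49)    $\gamma \log A - \gamma \frac{\varphi'(0)}{\varphi(0)} \psi_1^{(1)}(0) + (1 - \gamma)\log B - (1 - \gamma)\frac{\varphi'(0)}{\varphi(0)} \psi_2^{(1)}(0) = 0$

and

(4.50)    $\gamma(\log A)A^h - \gamma A^h \frac{\varphi'(h)}{\varphi(h)} \psi_1^{(1)}(0) + (1 - \gamma)(\log B)B^h - (1 - \gamma)B^h \frac{\varphi'(h)}{\varphi(h)} \psi_2^{(1)}(0) = 0$

from which $\psi_1^{(1)}(0)$ and $\psi_2^{(1)}(0)$ can be determined.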
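
The statement that the first moment of $n$ is the derivative of $\psi(\tau)$ at $\tau = 0$ can be checked numerically in the normal case. The sketch below is illustrative only: the function names and the numerical values ($Ez = .1$, $\sigma_z = 1$, $A = 19$, $B = 1/19$) are assumptions, and the derivative is approximated by a central difference rather than computed analytically.

```python
import math


def psi_normal(tau, mean_z, sd_z, A, B):
    """psi(tau) from (4.43)-(4.45) for normally distributed z (excess neglected)."""
    root = math.sqrt(mean_z ** 2 - 2.0 * sd_z ** 2 * tau)
    g1 = (-mean_z + root) / sd_z ** 2
    g2 = (-mean_z - root) / sd_z ** 2
    num = A ** g1 + B ** g2 - A ** g2 - B ** g1
    den = A ** g1 * B ** g2 - A ** g2 * B ** g1
    return num / den


def mean_n_from_psi(mean_z, sd_z, A, B, eps=1e-5):
    """First moment of n as a central-difference derivative of psi at tau = 0."""
    up = psi_normal(eps, mean_z, sd_z, A, B)
    down = psi_normal(-eps, mean_z, sd_z, A, B)
    return (up - down) / (2.0 * eps)


def mean_n_from_4_8(mean_z, sd_z, A, B):
    """Approximation (4.8) with gamma = (1 - B^h)/(A^h - B^h), h = -2Ez/sigma_z^2."""
    h = -2.0 * mean_z / sd_z ** 2
    gamma = (1 - B ** h) / (A ** h - B ** h)
    return ((1 - gamma) * math.log(B) + gamma * math.log(A)) / mean_z


args = (0.1, 1.0, 19.0, 1.0 / 19.0)
print(mean_n_from_psi(*args), mean_n_from_4_8(*args))
```

Both routes neglect the excess of $Z_n$ over the boundaries, so they should agree up to the error of the numerical differentiation; this is the content of equation (4.49) combined with (4.38).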

The distribution of $n$ can be obtained by inverting the characteristic function
$\psi(\tau)$. This was done in [4] (neglecting the excess of $Z_n$ over $\log A$ and $\log B$)
in the case when $z$ is normally distributed. The results obtained in [4] can be
briefly stated as follows: If $B = 0$, or if $B > 0$ and $A = \infty$, the distribution
of $n$ is a simple elementary function. If $B = 0$ and $Ez > 0$, the distribution of
$m = \frac{1}{2\sigma_z^2}(Ez)^2 n$ is given by

(4.51)    $F(m)\,dm = \frac{c}{2\Gamma(\frac{1}{2})\,m^{3/2}}\, e^{-\frac{c^2}{4m} - m + c}\,dm \qquad (0 \leq m < \infty)$

where

(4.52)    $c = \frac{1}{\sigma_z^2}(Ez)\log A.$

If $B > 0$, $A = \infty$ and $Ez < 0$ the distribution of $m = \frac{1}{2\sigma_z^2}(Ez)^2 n$ is given by the
expression we obtain from (4.51) if we substitute $\frac{1}{\sigma_z^2}(Ez)\log B$ for $c$.

If $B > 0$ and $A < \infty$, the distribution of $m$ is given by an infinite series where
each term is of the form (4.51) (see equation (76) in [4]).

Since $m$ is a discrete variable, it may seem paradoxical that we obtained a
probability density function for $m$. However, the explanation lies in the fact
that we neglected the excess of $Z_n$ over $\log A$ and $\log B$, which is zero only in the
limiting case when $Ez$ and $\sigma_z$ approach zero.

The distribution of $m$ given in (4.51) can be used as a good approximation
to the exact distribution of $m$ even if $B > 0$, provided that the probability that
$Z_n \geq \log A$ is nearly equal to 1.

It was pointed out in [4] that if $|Ez|$ and $\sigma_z$ are sufficiently small, the
distribution of $n$ determined under the assumption that $z$ is normally distributed will
be a good approximation to the exact distribution of $n$ even if $z$ is not normally
distributed.
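
A quick numerical sanity check of the density of $m$ is sketched below. It is an illustration under assumptions: the normalizing factor $c / (2\Gamma(\frac{1}{2}) m^{3/2})$ used here is the standard one for a first-passage (inverse Gaussian) density with the exponent $-c^2/(4m) - m + c$, the value $c = 2$ is arbitrary, the integration is a simple midpoint rule, and the function names are hypothetical.

```python
import math


def density_m(m, c):
    """Density of m for B = 0 and Ez > 0, with c = (Ez) log A / sigma_z^2 > 0."""
    if m <= 0.0:
        return 0.0
    factor = c / (2.0 * math.sqrt(math.pi) * m ** 1.5)   # Gamma(1/2) = sqrt(pi)
    return factor * math.exp(-c * c / (4.0 * m) - m + c)


def integrate_midpoint(f, a, b, steps):
    """Midpoint rule on [a, b] with the given number of equal steps."""
    width = (b - a) / steps
    return width * sum(f(a + (k + 0.5) * width) for k in range(steps))


c = 2.0
total = integrate_midpoint(lambda m: density_m(m, c), 0.0, 60.0, 60000)
mean_m = integrate_midpoint(lambda m: m * density_m(m, c), 0.0, 60.0, 60000)
print(total, mean_m)
```

Under these assumptions the density integrates to one and its mean is $c/2$. The latter agrees with (4.8): for $B = 0$ we have $\gamma = 1$, so $En \sim \log A / Ez$ and the mean of $m = (Ez)^2 n / (2\sigma_z^2)$ is $(Ez)\log A / (2\sigma_z^2) = c/2$.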

4.5. Lower limit of the probability that the sequential process will terminate with
a number of trials less than or equal to a given number. Let $P_i(n_0)$ be the
probability that the sequential process will terminate at a value $n \leq n_0$, calculated
under $H_i$ ($i = 0, 1$). Let

(4.53)    $\bar{P}_0(n_0) = P_0\left[ \sum_{\alpha=1}^{n_0} z_\alpha \leq \log B \right]$

and

(4.54)    $\bar{P}_1(n_0) = P_1\left[ \sum_{\alpha=1}^{n_0} z_\alpha \geq \log A \right].$

It is clear that

(4.55)    $\bar{P}_i(n_0) \leq P_i(n_0) \qquad (i = 0, 1).$

For calculating $\bar{P}_i(n_0)$ we shall assume that $n_0$ is sufficiently large so that
$\sum_{\alpha=1}^{n_0} z_\alpha$ can be regarded as normally distributed. Let $G(\lambda)$ be defined by

(4.56)    $G(\lambda) = \frac{1}{\sqrt{2\pi}} \int_\lambda^\infty e^{-t^2/2}\,dt.$

Furthermore, let

(4.57)    $\lambda_1(n_0) = \frac{\log A - n_0 E_1(z)}{\sqrt{n_0}\,\sigma_1(z)}$

and

(4.58)    $\lambda_0(n_0) = \frac{\log B - n_0 E_0(z)}{\sqrt{n_0}\,\sigma_0(z)}$

where $\sigma_i(z)$ is the standard deviation of $z$ under $H_i$. Then

(4.59)    $\bar{P}_1(n_0) = G[\lambda_1(n_0)]$

and

(4.60)    $\bar{P}_0(n_0) = 1 - G[\lambda_0(n_0)].$

Hence we have the inequalities

(4.61)    $P_1(n_0) \geq G[\lambda_1(n_0)]$

and

(4.62)    $P_0(n_0) \geq 1 - G[\lambda_0(n_0)].$

Putting $\log A = \log \frac{1 - \beta}{\alpha}$ and $\log B = \log \frac{\beta}{1 - \alpha}$, Table 2 shows the values
of $\bar{P}_1(n_0)$ and $\bar{P}_0(n_0)$ corresponding to different pairs $(\alpha, \beta)$ and different values
of $n_0$. In these calculations it has been assumed that the distribution under
$H_0$ is a normal distribution with mean zero and unit variance, and the distribution
under $H_1$ is a normal distribution with mean $\theta$ and unit variance. For each pair
$(\alpha, \beta)$ the value of $\theta$ was determined so that the number of observations required
by the current most powerful test of strength $(\alpha, \beta)$ is equal to 1000.

TABLE 2

Lower bound of the probability* that a sequential analysis will terminate within
various numbers of trials, when the most powerful current
test requires exactly 1000 trials

              α = .01 and β = .01       α = .01 and β = .05       α = .05 and β = .05
Number of   Alternative     Null      Alternative     Null      Alternative     Null
trials      hypothesis   hypothesis   hypothesis   hypothesis   hypothesis   hypothesis
               true         true         true         true         true         true

  1000         .910         .910         .799         .891         .773         .773
  1200         .950         .950         .871         .932         .837         .837
  1400         .972         .972         .916         .957         .883         .883
  1600         .985         .985         .946         .972         .915         .915
  1800         .991         .991         .965         .982         .938         .938
  2000         .995         .995         .977         .989         .955         .955
  2200         .997         .997         .985         .993         .967         .967
  2400         .999         .999         .990         .995         .976         .976
  2600         .999         .999         .994         .997         .982         .982
  2800        1.00         1.00          .996         .998         .987         .987
  3000        1.00         1.00          .997         .999         .990         .990

* The probabilities given are lower bounds for the true probabilities. They
relate to a test of the mean of a normally distributed variate, the difference
between the null and alternative hypothesis being adjusted for each pair of values
of α and β so that the number of trials required under the most powerful current
test is exactly 1000.
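
The entries of Table 2 follow from (4.57) to (4.62). The Python sketch below is an illustration under the assumptions stated for the table (unit-variance normal variates with means 0 and $\theta$, and $\theta$ chosen so that (4.30) gives exactly 1000 observations, ignoring rounding); the function name and the `n_fixed` parameter are hypothetical.

```python
import math
from statistics import NormalDist


def termination_lower_bounds(alpha, beta, n0, n_fixed=1000):
    """Lower bounds (4.61) and (4.62), returned as (under H1, under H0)."""
    std = NormalDist()
    # theta chosen so that (4.30) gives n_fixed observations for the current test
    theta = (std.inv_cdf(1 - alpha) + std.inv_cdf(1 - beta)) / math.sqrt(n_fixed)
    log_a = math.log((1 - beta) / alpha)
    log_b = math.log(beta / (1 - alpha))
    # z = theta * x - theta^2 / 2, so E1(z) = theta^2/2, E0(z) = -theta^2/2, sd = theta
    mean1, mean0, sd = theta ** 2 / 2.0, -theta ** 2 / 2.0, theta
    lam1 = (log_a - n0 * mean1) / (math.sqrt(n0) * sd)   # (4.57)
    lam0 = (log_b - n0 * mean0) / (math.sqrt(n0) * sd)   # (4.58)
    under_h1 = 1.0 - std.cdf(lam1)                       # G[lambda_1(n0)]
    under_h0 = std.cdf(lam0)                             # 1 - G[lambda_0(n0)]
    return under_h1, under_h0


print(termination_lower_bounds(0.01, 0.05, 1000))
```

For $\alpha = .01$, $\beta = .05$ and $n_0 = 1000$ this gives approximately .799 and .891, in agreement with the first row of the table.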


4.6. Truncated sequential analysis. In some applications a definite upper
bound for the number of observations may be desirable. Thus, a certain
integer $n_0$ is chosen so that if the sequential process does not lead to a final
decision for $n \leq n_0$, a new rule is given for the acceptance or rejection of $H_0$
at the stage $n = n_0$.

A simple and reasonable rule for the acceptance or rejection of $H_0$ at the stage
$n = n_0$ can be given as follows: If $\sum_{\alpha=1}^{n_0} z_\alpha \leq 0$ we accept $H_0$ and if
$\sum_{\alpha=1}^{n_0} z_\alpha > 0$ we accept $H_1$. By thus truncating the sequential process we change,
however, the probabilities of errors of the first and second kinds. Let $\alpha$ and $\beta$ be the
probabilities of errors of the first and second kinds, respectively, if the sequential
test is not truncated. Let $\alpha(n_0)$ and $\beta(n_0)$ be the probabilities of errors of the
first and second kinds if the test is truncated at $n = n_0$. We shall derive upper
bounds for $\alpha(n_0)$ and $\beta(n_0)$.

First we shall derive an upper bound for $\alpha(n_0)$. Let $\rho_0(n_0)$ be the probability
(under the null hypothesis) that the following three conditions are simultaneously
fulfilled:

(i) $\log B < \sum_{\alpha=1}^{n} z_\alpha < \log A$ for $n = 1, \ldots, n_0 - 1$

(ii) $0 < \sum_{\alpha=1}^{n_0} z_\alpha < \log A$

(iii) continuing the sequential process beyond $n_0$, it terminates with the
acceptance of $H_0$.

It is clear that

(4.63)    $\alpha(n_0) \leq \alpha + \rho_0(n_0).$

Let $\bar{\rho}_0(n_0)$ be the probability (under the null hypothesis) that
$0 < \sum_{\alpha=1}^{n_0} z_\alpha < \log A$. Then obviously

          $\rho_0(n_0) \leq \bar{\rho}_0(n_0)$

and consequently

(4.64)    $\alpha(n_0) \leq \alpha + \bar{\rho}_0(n_0).$

Let $\rho_1(n_0)$ be the probability under the alternative hypothesis that the
following three conditions are simultaneously fulfilled:

(i) $\log B < \sum_{\alpha=1}^{n} z_\alpha < \log A$ for $n = 1, \ldots, n_0 - 1$

(ii) $\log B < \sum_{\alpha=1}^{n_0} z_\alpha \leq 0$

(iii) continuing the sequential process beyond $n_0$, it terminates with the
acceptance of $H_1$.

It is clear that

(4.65)    $\beta(n_0) \leq \beta + \rho_1(n_0).$

Let $\bar{\rho}_1(n_0)$ be the probability (under the alternative hypothesis) that
$\log B < \sum_{\alpha=1}^{n_0} z_\alpha \leq 0$. Then $\rho_1(n_0) \leq \bar{\rho}_1(n_0)$ and consequently



(4.66)    $\beta(n_0) \leq \beta + \bar{\rho}_1(n_0).$

Let

          $\nu_1 = \frac{-n_0 E_0(z)}{\sqrt{n_0}\,\sigma_0(z)}, \qquad \nu_2 = \frac{\log A - n_0 E_0(z)}{\sqrt{n_0}\,\sigma_0(z)},$

          $\nu_3 = \frac{-n_0 E_1(z)}{\sqrt{n_0}\,\sigma_1(z)}, \qquad \nu_4 = \frac{\log B - n_0 E_1(z)}{\sqrt{n_0}\,\sigma_1(z)},$

where $\sigma_i(z)$ is the standard deviation of $z$ under $H_i$ ($i = 0, 1$). Then

(4.67)    $\bar{\rho}_0(n_0) = G(\nu_1) - G(\nu_2)$

and

(4.68)    $\bar{\rho}_1(n_0) = G(\nu_4) - G(\nu_3).$

From (4.64), (4.66), (4.67) and (4.68) we obtain

(4.69)    $\alpha(n_0) \leq \alpha + G(\nu_1) - G(\nu_2)$

and

(4.70)    $\beta(n_0) \leq \beta + G(\nu_4) - G(\nu_3).$

The upper bounds given in (4.69) and (4.70) may considerably exceed $\alpha(n_0)$
and $\beta(n_0)$, respectively. It would be desirable to find closer limits.

Table 3 shows the values of the upper bounds of $\alpha(n_0)$ and $\beta(n_0)$ given by
formulas (4.69) and (4.70) corresponding to different pairs $(\alpha, \beta)$ and different values
of $n_0$. In these calculations we have put $\log A = \log \frac{1 - \beta}{\alpha}$, $\log B = \log \frac{\beta}{1 - \alpha}$
and assumed that the distribution under $H_0$ is a normal distribution with mean
zero and unit variance, and the distribution under $H_1$ is a normal distribution
with mean $\theta$ and unit variance. For each pair $(\alpha, \beta)$ the value of $\theta$ has been
determined so that the number of observations required by the current most
powerful test of strength $(\alpha, \beta)$ is equal to 1000.

It seems to the author that the upper limits given in (4.69) and (4.70) are
considerably above the true $\alpha(n_0)$ and $\beta(n_0)$ respectively, when $n_0$ is not much
higher than the value of $n$ needed for the current most powerful test.
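
The upper bounds (4.69) and (4.70) can be computed in the same setting as Table 3. The sketch below is illustrative (unit-variance normal variates with means 0 and $\theta$, $\theta$ fixed so that the current test needs 1000 observations, rounding ignored); the function name and the `n_fixed` parameter are hypothetical.

```python
import math
from statistics import NormalDist


def truncation_error_bounds(alpha, beta, n0, n_fixed=1000):
    """Upper bounds (4.69), (4.70) for alpha(n0) and beta(n0) after truncation at n0."""
    std = NormalDist()
    theta = (std.inv_cdf(1 - alpha) + std.inv_cdf(1 - beta)) / math.sqrt(n_fixed)
    log_a = math.log((1 - beta) / alpha)
    log_b = math.log(beta / (1 - alpha))
    drift = n0 * theta ** 2 / 2.0        # n0 * E1(z) = -n0 * E0(z)
    scale = math.sqrt(n0) * theta        # sqrt(n0) * sigma_i(z), the same for i = 0, 1

    def upper_tail(x):                   # G(x) of (4.56)
        return 1.0 - std.cdf(x)

    nu1 = drift / scale
    nu2 = (log_a + drift) / scale
    nu3 = -drift / scale
    nu4 = (log_b - drift) / scale
    alpha_bound = alpha + upper_tail(nu1) - upper_tail(nu2)
    beta_bound = beta + upper_tail(nu4) - upper_tail(nu3)
    return alpha_bound, beta_bound


print(truncation_error_bounds(0.01, 0.05, 1000))
```

For $\alpha = .01$, $\beta = .05$ and $n_0 = 1000$ the bounds come out at about .033 and .070, and they approach $\alpha$ and $\beta$ as $n_0$ increases.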

4.7. Efficiency of the sequential probability ratio test. Let $S$ be any sequential test for which the probability of an error of the first kind is $\alpha$, the probability of an error of the second kind is $\beta$ and the probability that the test procedure will eventually terminate is one. Let $S'$ be the sequential probability ratio test whose strength is equal to that of $S$. We shall prove that the sequential probability ratio test is an optimum test, i.e., that $E_i(n \mid S) \ge E_i(n \mid S')$ ($i = 0, 1$), if for $S'$ the excess of $Z_n$ over $\log A$ and $\log B$ can be neglected. This excess is exactly zero if $z$ can take only the values $d$ and $-d$ and if $\log A$ and $\log B$ are integral multiples of $d$. In any other case the excess will not be identically zero. However, if $|Ez|$ and $\sigma_z$ are sufficiently small, the excess of $Z_n$ over $\log A$ and $\log B$ is negligible.

For any random variable $u$ we shall denote by $E_i^*(u \mid S)$ the conditional expected value of $u$ under the hypothesis $H_i$ ($i = 0, 1$) and under the restriction that $H_0$ is accepted. Similarly, let $E_i^{**}(u \mid S)$ be the conditional expected value of $u$ under the hypothesis $H_i$ ($i = 0, 1$) and under the restriction that $H_1$ is accepted. In the notations for these expected values the symbol $S$ stands for

TABLE 3

Effect on risks of error of truncating* a sequential analysis at a predetermined number of trials

Number of    $\alpha = .01$, $\beta = .01$    $\alpha = .01$, $\beta = .05$    $\alpha = .05$, $\beta = .05$
trials       Upper bound of effective      Upper bound of effective      Upper bound of effective
             $\alpha$       $\beta$             $\alpha$       $\beta$             $\alpha$       $\beta$

1000         .020       .020               .033       .070               .095       .095
1200         .015       .015               .024       .063               .082       .082
1400         .013       .013               .019       .058               .072       .072
1600         .012       .012               .016       .055               .066       .066
1800         .011       .011               .014       .053               .062       .062
2000         .010       .010               .012       .052               .058       .058
2200         .010       .010               .012       .051               .056       .056
2400         .010       .010               .011       .051               .055       .055
2600         .010       .010               .011       .051               .053       .053
2800         .010       .010               .010       .050               .053       .053
3000         .010       .010               .010       .050               .052       .052

* If the sequential analysis is based on the values $\alpha$ and $\beta$ shown, but a decision is made at $n_0$ trials even when the normal sequential criteria would require a continuation of the process, the realized values of $\alpha$ and $\beta$ will not exceed the tabular entries. The table relates to a test of the mean of a normally distributed variate, the difference between the null and alternative hypotheses being adjusted for each pair $(\alpha, \beta)$ so that the number of trials required by the current test is 1000.

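the sequential test used. Denote by $Q_i(S)$ the totality of all samples for which the test $S$ leads to the acceptance of $H_i$. Then we have

(4.71)    $E_0^*\left(\frac{p_{1n}}{p_{0n}} \mid S\right) = \frac{P_1[Q_0(S)]}{P_0[Q_0(S)]} = \frac{\beta}{1-\alpha}$,

(4.72)    $E_0^{**}\left(\frac{p_{1n}}{p_{0n}} \mid S\right) = \frac{P_1[Q_1(S)]}{P_0[Q_1(S)]} = \frac{1-\beta}{\alpha}$,

(4.73)    $E_1^*\left(\frac{p_{0n}}{p_{1n}} \mid S\right) = \frac{P_0[Q_0(S)]}{P_1[Q_0(S)]} = \frac{1-\alpha}{\beta}$

and

(4.74)    $E_1^{**}\left(\frac{p_{0n}}{p_{1n}} \mid S\right) = \frac{P_0[Q_1(S)]}{P_1[Q_1(S)]} = \frac{\alpha}{1-\beta}$.

To prove the efficiency of the sequential probability ratio test, we shall first derive two lemmas.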

Lemma 1. For any random variable $u$ the inequality

(4.75)    $e^{Eu} \le E e^{u}$

holds.

Proof: Inequality (4.75) can be written as

(4.76)    $1 \le E e^{u'}$

where $u' = u - Eu$. Lemma 1 is proved if we show that (4.76) holds for any random variable $u'$ with zero mean. Expanding $e^{u'}$ in a Taylor series around $u' = 0$, we obtain

(4.77)    $e^{u'} = 1 + u' + \tfrac{1}{2} u'^2 e^{\xi(u')}$, where $\xi(u')$ lies between $0$ and $u'$.

Hence, since $Eu' = 0$,

(4.78)    $E e^{u'} = 1 + \tfrac{1}{2} E\left(u'^2 e^{\xi(u')}\right) \ge 1$

and Lemma 1 is proved.

Lemma 2. Let $S$ be a sequential test such that there exists a finite integer $N$ with the property that the number $n$ of observations required for the test is $\le N$. Then

(4.79)    $E_i(n \mid S) = \dfrac{E_i\left(\log\frac{p_{1n}}{p_{0n}} \mid S\right)}{E_i(z)} \quad (i = 0, 1)$.

The proof is omitted, since it is essentially the same as that of equation (4.5) for the sequential probability ratio test.

On the basis of Lemmas 1 and 2 we shall be able to derive the following 
theorem. 

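Theorem. Let $S$ be any sequential test for which the probability of an error of the first kind is $\alpha$, the probability of an error of the second kind is $\beta$ and the probability that the test procedure will eventually terminate is equal to one. Then

(4.80)    $E_0(n \mid S) \ge \dfrac{1}{E_0(z)}\left[(1-\alpha)\log\frac{\beta}{1-\alpha} + \alpha\log\frac{1-\beta}{\alpha}\right]$

and

(4.81)    $E_1(n \mid S) \ge \dfrac{1}{E_1(z)}\left[\beta\log\frac{\beta}{1-\alpha} + (1-\beta)\log\frac{1-\beta}{\alpha}\right]$.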

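Proof: First we shall prove the theorem in the case when there exists a finite integer $N$ such that $n$ never exceeds $N$. According to Lemma 2 we have

(4.82)    $E_0(n \mid S) = \dfrac{1}{E_0(z)}\left[(1-\alpha)\,E_0^*\left(\log\frac{p_{1n}}{p_{0n}} \mid S\right) + \alpha\,E_0^{**}\left(\log\frac{p_{1n}}{p_{0n}} \mid S\right)\right]$

and

(4.83)    $E_1(n \mid S) = \dfrac{1}{E_1(z)}\left[\beta\,E_1^*\left(\log\frac{p_{1n}}{p_{0n}} \mid S\right) + (1-\beta)\,E_1^{**}\left(\log\frac{p_{1n}}{p_{0n}} \mid S\right)\right]$.

From equations (4.71)-(4.74) and Lemma 1 we obtain the inequalities

(4.84)    $E_0^*\left(\log\frac{p_{1n}}{p_{0n}} \mid S\right) \le \log\frac{\beta}{1-\alpha}$,

(4.85)    $E_0^{**}\left(\log\frac{p_{1n}}{p_{0n}} \mid S\right) \le \log\frac{1-\beta}{\alpha}$,

(4.86)    $E_1^*\left(\log\frac{p_{1n}}{p_{0n}} \mid S\right) \ge \log\frac{\beta}{1-\alpha}$

and

(4.87)    $E_1^{**}\left(\log\frac{p_{1n}}{p_{0n}} \mid S\right) \ge \log\frac{1-\beta}{\alpha}$.

Since $E_0(z) < 0$, (4.80) follows from (4.82), (4.84) and (4.85). Similarly, since $E_1(z) > 0$, (4.81) follows from (4.83), (4.86) and (4.87). This proves the theorem when there exists a finite integer $N$ such that $n \le N$.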

To prove the theorem for any sequential test $S$ of strength $(\alpha, \beta)$, for any positive integer $N$ let $S_N$ be the sequential test we obtain by truncating $S$ at the $N$-th observation if no decision is reached before the $N$-th observation. Let $(\alpha_N, \beta_N)$ be the strength of $S_N$. Then we have

(4.88)    $E_0(n \mid S) \ge E_0(n \mid S_N) \ge \dfrac{1}{E_0(z)}\left[(1-\alpha_N)\log\frac{\beta_N}{1-\alpha_N} + \alpha_N\log\frac{1-\beta_N}{\alpha_N}\right]$

and

(4.89)    $E_1(n \mid S) \ge E_1(n \mid S_N) \ge \dfrac{1}{E_1(z)}\left[\beta_N\log\frac{\beta_N}{1-\alpha_N} + (1-\beta_N)\log\frac{1-\beta_N}{\alpha_N}\right]$.

Since $\lim_{N \to \infty} \alpha_N = \alpha$ and $\lim_{N \to \infty} \beta_N = \beta$, inequalities (4.80) and (4.81) follow from (4.88) and (4.89). Hence the proof of the theorem is completed.
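As a rough numerical illustration of the theorem, the right-hand members of (4.80) and (4.81) can be evaluated directly once $E_0(z)$ and $E_1(z)$ are known. The sketch below is not part of the original argument; the function name is hypothetical, and the example values assume a normal variate with unit variance and means 0 and $\theta$, for which $E_0(z) = -\theta^2/2$ and $E_1(z) = \theta^2/2$.

```python
from math import log


def expected_n_lower_bounds(alpha, beta, e0_z, e1_z):
    """Right-hand members of (4.80) and (4.81).

    e0_z and e1_z are E_0(z) and E_1(z); the theorem uses E_0(z) < 0 < E_1(z).
    """
    if not (e0_z < 0 < e1_z):
        raise ValueError("expected E_0(z) < 0 < E_1(z)")
    log_b = log(beta / (1 - alpha))
    log_a = log((1 - beta) / alpha)
    bound0 = ((1 - alpha) * log_b + alpha * log_a) / e0_z
    bound1 = (beta * log_b + (1 - beta) * log_a) / e1_z
    return bound0, bound1


# Assumed example: normal means 0 versus theta = 0.2, unit variance.
theta = 0.2
b0, b1 = expected_n_lower_bounds(0.05, 0.05, -theta**2 / 2, theta**2 / 2)
```

With $\alpha = \beta$ and this symmetric normal example the two bounds coincide, as one would expect.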

If for the sequential probability ratio test $S'$ the excess of the cumulative sum $Z_n$ over the boundaries $\log A$ and $\log B$ is zero, $E_0(n \mid S')$ is exactly equal to the right hand side member of (4.80) and $E_1(n \mid S')$ is exactly equal to the right hand side member of (4.81). Hence, in this case $S'$ is exactly an optimum test. If both $|Ez|$ and $\sigma_z$ are small, also the expected value of the excess over the boundaries will be small and, therefore, $E_0(n \mid S')$ and $E_1(n \mid S')$ will be only slightly larger than the right hand members of (4.80) and (4.81), respectively. Thus, in such a case the sequential probability ratio test is, if not exactly, very nearly an optimum test.*


* The author conjectures that the sequential probability ratio test is exactly an optimum test even if the excess of $Z_n$ over the boundaries is not zero. However, he did not succeed in proving this.

Part II. Sequential Test of a Simple or Composite Hypothesis Against a Set of Alternatives

In Part I we have dealt with the problem of testing a simple hypothesis $H_0$ against a single alternative $H_1$. Here we shall consider the problem of testing a simple or composite hypothesis against a set of infinitely many alternatives. By a simple hypothesis we mean a hypothesis which specifies uniquely the probability distribution of the random variable $x$ under consideration. A hypothesis is called composite if it is not simple.

5. Test of a Simple Hypothesis Against One-sided Alternatives

5.1. General remarks. Let $f(x, \theta)$ be the probability density function of a random variable $X$, where $\theta$ is an unknown parameter. Suppose that it is required to test the simple hypothesis that $\theta = \theta_0$ and that the alternative values of $\theta$ are restricted to values $\theta > \theta_0$. Assume that it is desired to have a sequential test such that the probability of an error of the first kind is equal to a given $\alpha$.

The probability of an error of the second kind is no longer a single value, but is a function of the true value of $\theta$. If $f(x, \theta)$ is a continuous function of $x$ and $\theta$, the probability of an error of the second kind will be arbitrarily near $1 - \alpha$ if the true value of $\theta$ is sufficiently near $\theta_0$. Hence, if $\alpha$ is small, the probability of an error of the second kind is necessarily large when the true value of $\theta$ is very near $\theta_0$. In most practical applications we do not care if the probability of an error of the second kind is high when the true value of $\theta$ is very near $\theta_0$, since in this case the error committed by accepting $\theta_0$ is usually of very little importance. However, there will be a value $\theta_1 > \theta_0$ such that we wish the probability of an error of the second kind to be less than or equal to a given small positive value $\beta$ whenever the true value of $\theta$ is greater than or equal to $\theta_1$.

In this case we can proceed as follows: Consider the single alternative hypothesis $H_1$ that $\theta = \theta_1$. Construct a sequential test for testing $\theta = \theta_0$ against the single alternative $H_1$ such that the probability of an error of the first kind is $\alpha$ and the probability of an error of the second kind, i.e., the probability of accepting $\theta_0$ when $\theta_1$ is true, is $\beta$. If this sequential test has the further property that the probability of an error of the second kind is less than or equal to $\beta$ whenever the true value of $\theta$ is greater than $\theta_1$, then this sequential test provides a satisfactory solution of the problem of testing the hypothesis that $\theta = \theta_0$ against the set of alternatives $\theta > \theta_0$.

In most of the important cases occurring in practice, such as when $X$ has a normal, binomial, or Poisson distribution, etc., the sequential probability ratio test for testing the hypothesis that $\theta = \theta_0$ against a single alternative $\theta_1$ ($\theta_1 > \theta_0$) satisfies the condition that the probability of an error of the second kind is a monotonically decreasing function of $\theta$ in the domain $\theta > \theta_0$. Thus, in all these cases the sequential probability ratio test for testing the hypothesis that $\theta = \theta_0$ against a properly chosen alternative $\theta_1$ provides a satisfactory solution of our problem.

The case in which the alternative values of $\theta$ are restricted to values less than $\theta_0$ is entirely analogous to that in which the alternatives are restricted to values greater than $\theta_0$, and need not be discussed separately.

It should be pointed out that the test procedure for testing $\theta = \theta_0$ against alternatives $\theta > \theta_0$, as described in this section, is also suitable for testing the composite hypothesis that $\theta \le \theta_0$, provided that the probability of rejecting the null hypothesis is $\le \alpha$ whenever the true value of $\theta$ is $< \theta_0$. This condition is fulfilled, for instance, when $X$ has a normal, binomial or Poisson distribution.

5.2. Application to binomial distributions. 5.2.1. Statement of the problem. The case of a binomial distribution arises when the result of a single observation is a classification into one of two categories. For example, this is the situation in acceptance inspection of manufactured products, if each unit inspected is classified into one of the two categories, non-defective and defective. Let $p$ denote the probability that an item belongs to a given category. The value of $p$ is usually unknown. We shall deal here with the problem of testing the hypothesis that $p$ does not exceed a given value $p'$ against the alternative possibility that $p > p'$.

Since acceptance inspection of manufactured products is perhaps the most 
important and widest field of application of such a test procedure, we shall, in 
continuing the discussion, use the terminology of acceptance inspection. This, 
of course, does not mean that the test procedure is not applicable to other 
cases. Suppose that a lot containing a large number of units is submitted for 
sampling inspection. Let p denote the proportion of defective units contained 
in the lot. The probability that a unit drawn at random from the lot will be 
defective is equal to $p$. If $m$ units are drawn at random from the lot, the probability that there will be $d$ defectives among them is given by*

(5.1)    $\dfrac{m!}{d!\,(m-d)!}\; p^{d} (1-p)^{m-d} \quad (d = 0, 1, \dots, m)$.

The probability distribution as given in (5.1) is called a binomial distribution.

The purpose of sampling inspection is to decide whether the lot should be accepted or rejected. It is clear that for high values of $p$ we want to reject the lot and for low values of $p$ we want to accept the lot. Thus, it will be possible to specify a particular value of $p$, say $p'$, so that if $p \le p'$ we wish to accept the lot, and if $p > p'$ we wish to reject the lot. Thus, our problem is to devise a proper sampling inspection plan for testing the hypothesis that $p \le p'$.

5.2.2. Tolerated risks for making a wrong decision. No sampling inspection plan can guarantee that the correct decision will always be made, i.e., that the lot will always be accepted when $p \le p'$ and the lot will always be rejected when $p > p'$, unless the lot is inspected completely. A complete inspection is usually rather uneconomical and one is willing to take some risk of making a wrong decision if this permits a reduction in the amount of inspection. Hence, recommendations as to the proper choice of a sampling inspection plan can be made only after the risks that can be tolerated have been stated.

* Formula (5.1) is exact only if the lot contains infinitely many units. While the lot is always finite in practice, we shall assume that $m$ is small as compared with the lot size so that formula (5.1) can be used.

If $p$ is equal to the marginal value $p'$ we may say that it is indifferent to us whether the lot is accepted or rejected. If $p < p'$ we prefer acceptance and this preference is the stronger the smaller $p$. Similarly, if $p > p'$ we prefer rejection of the lot and this preference increases as $p$ increases. Thus, it will be possible to select a value $p_0 < p'$ and a value $p_1 > p'$ such that the error is considered serious only if we accept the lot when $p \ge p_1$, or we reject the lot when $p \le p_0$.

After the two values $p_0$ and $p_1$ have been selected the risks that we are willing to tolerate may reasonably be stated as follows: a sampling inspection plan is required such that the probability of rejecting the lot is less than or equal to a preassigned value $\alpha$ whenever $p \le p_0$ and the probability of accepting the lot is less than or equal to a preassigned value $\beta$ whenever $p \ge p_1$. Thus, the tolerated risks are characterized by the four quantities $p_0$, $p_1$, $\alpha$ and $\beta$. The proper sampling plan can be determined after these four quantities have been chosen.

5.2.3. The sequential probability ratio test corresponding to the quantities $p_0$, $p_1$, $\alpha$ and $\beta$. Let $H_0$ be the hypothesis that $p = p_0$ and $H_1$ the hypothesis that $p = p_1$. Consider the sequential probability ratio test $T$ for testing $H_0$ against $H_1$ for which $\alpha$ is the probability of accepting $H_1$ when $H_0$ is true (error of the first kind) and $\beta$ is the probability of accepting $H_0$ when $H_1$ is true (error of the second kind). This probability ratio test will satisfy all our requirements, since for this test the probability of accepting the lot (accepting $H_0$) is $\le \beta$ whenever $p \ge p_1$ and the probability of rejecting the lot (accepting $H_1$) is $\le \alpha$ whenever $p \le p_0$.

According to formulas (3.8), (3.9), (3.10) and section 3.3 the sequential test $T$ is given as follows: At each stage of the inspection, at the $m$-th observation for each integral value of $m$, calculate the quantity

(5.2)    $\dfrac{p_{1m}}{p_{0m}} = \dfrac{p_1^{d_m}(1-p_1)^{m-d_m}}{p_0^{d_m}(1-p_0)^{m-d_m}} \quad (m = 1, 2, \dots)$

where $d_m$ denotes the number of defectives found in the first $m$ units inspected. Reject the lot (accept $H_1$) if

(5.3)    $\dfrac{p_{1m}}{p_{0m}} \ge \dfrac{1-\beta}{\alpha}$.

Accept the lot if

(5.4)    $\dfrac{p_{1m}}{p_{0m}} \le \dfrac{\beta}{1-\alpha}$.

Take an additional observation if*

(5.5)    $\dfrac{\beta}{1-\alpha} < \dfrac{p_{1m}}{p_{0m}} < \dfrac{1-\beta}{\alpha}$.


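For the purpose of practical computations it is useful to rewrite the inequalities (5.3), (5.4) and (5.5) in a somewhat different form. Taking the logarithms of both sides of the inequalities (5.3), (5.4) and (5.5) one can easily verify that these inequalities are equivalent to

(5.6)    $d_m \ge \dfrac{\log\frac{1-\beta}{\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} + m\,\dfrac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}}$,

(5.7)    $d_m \le \dfrac{\log\frac{\beta}{1-\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} + m\,\dfrac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}}$

and

(5.8)    $\dfrac{\log\frac{\beta}{1-\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} + m\,\dfrac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} < d_m < \dfrac{\log\frac{1-\beta}{\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} + m\,\dfrac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}}$,

respectively.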


Using the inequalities (5.6), (5.7) and (5.8) the test procedure can easily be carried out as follows: For each $m$ we compute the acceptance number

(5.9)    $A_m = \dfrac{\log\frac{\beta}{1-\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} + m\,\dfrac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}}$

and the rejection number

(5.10)    $R_m = \dfrac{\log\frac{1-\beta}{\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} + m\,\dfrac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}}$.

* There is a slight approximation involved in the formulas (5.3), (5.4) and (5.5). For details see section 3.3.

These acceptance numbers $A_m$ and rejection numbers $R_m$ are best tabulated before inspection starts. Inspection is continued as long as $A_m < d_m < R_m$. At the first time when $d_m$ does not lie between the acceptance and rejection numbers, the sampling inspection is terminated. The lot is accepted if $d_m \le A_m$ and the lot is rejected if $d_m \ge R_m$.
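The procedure based on the acceptance and rejection numbers can be sketched as follows. This is a minimal illustration rather than a prescribed implementation: the function names are hypothetical, natural logarithms are used (the base cancels in the ratios), and inspection results are assumed to arrive as 0 (non-defective) or 1 (defective).

```python
from math import log


def sprt_binomial_limits(p0, p1, alpha, beta, m):
    """Acceptance number A_m (5.9) and rejection number R_m (5.10)."""
    denom = log(p1 / p0) - log((1 - p1) / (1 - p0))
    slope = log((1 - p0) / (1 - p1)) / denom  # common slope of both lines
    a_m = log(beta / (1 - alpha)) / denom + m * slope
    r_m = log((1 - beta) / alpha) / denom + m * slope
    return a_m, r_m


def inspect(items, p0, p1, alpha, beta):
    """Inspect units one at a time; return (decision, number of units inspected).

    decision is 'accept', 'reject', or 'continue' if the items run out
    before d_m leaves the interval (A_m, R_m).
    """
    d = 0
    m = 0
    for m, defective in enumerate(items, start=1):
        d += defective
        a_m, r_m = sprt_binomial_limits(p0, p1, alpha, beta, m)
        if d <= a_m:
            return "accept", m
        if d >= r_m:
            return "reject", m
    return "continue", m
```

For instance, with the assumed values $p_0 = .1$, $p_1 = .3$ and $\alpha = \beta = .05$, a run of non-defective units leads to acceptance at the 12th unit, while a run of defectives leads to rejection at the 3rd.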

The test procedure can also be carried out graphically as indicated in Figure 2. The number $m$ of observations made is measured along the abscissa axis. Since $A_m$ is a linear function of $m$, the points $(m, A_m)$ will lie on a straight line $L_0$. Similarly, the points $(m, R_m)$ will lie on a straight line $L_1$. We draw the lines $L_0$ and $L_1$ and the points $(m, d_m)$ are plotted as inspection goes on. At the first time when the point $(m, d_m)$ does not lie between the lines $L_0$ and $L_1$ inspection is terminated. The lot is rejected if the point $(m, d_m)$ lies on $L_1$ or above, and the lot is accepted if the point $(m, d_m)$ lies on $L_0$ or below.

5.2.4. The operating characteristic curve of the test. As mentioned in section 5.2.3 the test procedure defined by the inequalities (5.6), (5.7) and (5.8) will satisfy the requirement that the probability of accepting the lot is $\le \beta$ whenever $p \ge p_1$ and the probability of rejecting the lot is $\le \alpha$ whenever $p \le p_0$. Although this already describes the essential features of the test procedure, it may be desirable to know the probability $L_p$ of accepting the lot for any possible value $p$ of the proportion of defectives in the lot. Clearly, $L_p$ will be a function of $p$ and can be plotted as shown in Figure 3. The curve $L_p$ is called the operating characteristic curve. The range of $p$ is, of course, from 0 to 1. $L_p = 1$ for $p = 0$ and $L_p = 0$ for $p = 1$. The value of $L_p$ decreases as $p$ increases.
We already know that $L_{p_0} = 1 - \alpha$ and $L_{p_1} = \beta$. Now we shall give a method for computing the value of $L_p$ for any $p$. If $p_1$ is not far from $p_0$, which will usually be the case in practice, a good approximation to $L_p$ is given by (see equation 3.35)

(5.11)    $L_p \sim \dfrac{\left(\frac{1-\beta}{\alpha}\right)^h - 1}{\left(\frac{1-\beta}{\alpha}\right)^h - \left(\frac{\beta}{1-\alpha}\right)^h}$

where $h$ is equal to the non-zero root of the equation

(5.12)    $p\left(\dfrac{p_1}{p_0}\right)^h + (1-p)\left(\dfrac{1-p_1}{1-p_0}\right)^h = 1$.

To plot the operating characteristic curve, it is not necessary to solve (5.12) with respect to $h$. Instead we can proceed as follows: From (5.12) we express $p$ as a function of $h$, i.e.,

(5.13)    $p = \dfrac{1 - \left(\frac{1-p_1}{1-p_0}\right)^h}{\left(\frac{p_1}{p_0}\right)^h - \left(\frac{1-p_1}{1-p_0}\right)^h}$.

For any given value $h$ we compute the value of $p$ from (5.13) and the value of $L_p$ from (5.11). The point $(p, L_p)$ obtained in this way will be a point of the operating characteristic curve. Doing this for various values of $h$ we can obtain a sufficient number of points on the operating characteristic curve so that the curve can be drawn.

5.2.5. The average amount of inspection required by the test. Denote by $E_p(n)$ the expected value of the number of observations required by the test. Clearly, $E_p(n)$ is a function of $p$. According to (4.8) a good approximation to the value of $E_p(n)$ is given by

(5.14)    $E_p(n) \sim \dfrac{L_p \log\frac{\beta}{1-\alpha} + (1 - L_p)\log\frac{1-\beta}{\alpha}}{p\log\frac{p_1}{p_0} + (1-p)\log\frac{1-p_1}{1-p_0}}$

where $L_p$ is given by (5.11). Plotting $E_p(n)$ as a function of $p$, the curve obtained will, in general, be of the type shown in Fig. 4. The maximum will ordinarily be reached between $p_0$ and $p_1$. Furthermore, the curve will, in general, be increasing as $p$ increases from 0 to $p_0$, and decreasing as $p$ increases from $p_1$ to 1.
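The parametric construction of the operating characteristic curve, together with the approximation to $E_p(n)$, can be sketched as follows. The code assumes the usual forms of the approximations: $L_p \sim (A^h - 1)/(A^h - B^h)$ with $A = (1-\beta)/\alpha$ and $B = \beta/(1-\alpha)$, and $p$ obtained by solving the relation $p\,(p_1/p_0)^h + (1-p)\,((1-p_1)/(1-p_0))^h = 1$ for $p$. The function name is hypothetical.

```python
from math import log


def oc_and_asn_point(h, p0, p1, alpha, beta):
    """For a non-zero h, return (p, L_p, E_p(n)) as one point of both curves."""
    if h == 0:
        raise ValueError("h must be non-zero")
    a = (1 - beta) / alpha
    b = beta / (1 - alpha)
    r1 = p1 / p0
    r2 = (1 - p1) / (1 - p0)
    p = (1 - r2**h) / (r1**h - r2**h)      # p as a function of h
    l_p = (a**h - 1) / (a**h - b**h)       # approximate probability of acceptance
    # Expected value of z for this p; it tends to 0 as h tends to 0, so the
    # ratio below becomes numerically unstable for h very close to 0.
    e_z = p * log(r1) + (1 - p) * log(r2)
    e_n = (l_p * log(b) + (1 - l_p) * log(a)) / e_z
    return p, l_p, e_n


# Sweep h over positive and negative values to trace both curves.
points = [oc_and_asn_point(h / 4, 0.1, 0.3, 0.05, 0.05)
          for h in range(-12, 13) if h != 0]
```

As a check on the sketch, $h = 1$ returns $p = p_0$ with $L_p = 1 - \alpha$, and $h = -1$ returns $p = p_1$ with $L_p = \beta$.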



5.3. Sequential analysis of double dichotomies. 5.3.1. Formulation of the problem. Suppose that we want to compare the effectiveness of two production processes where the effectiveness of a production process is measured in terms of the proportion of effective units in the sequence produced. We shall say that a unit is effective if it has a certain desirable property, for example, if it withstands a certain strain. Let $p_1$ be the proportion of effectives if process 1 is used, and $p_2$ the proportion of effectives if process 2 is used. In other words, $p_1$ is the probability that a unit produced will be effective if process 1 is used, and $p_2$ is the probability that a unit produced will be effective if process 2 is used. Suppose that the manufacturer does not know the values of $p_1$ and $p_2$, and that process 1 is in operation. If $p_1 \ge p_2$, then the manufacturer wants to retain process 1. However, if $p_1 < p_2$, especially if $p_1$ is substantially smaller than $p_2$, the manufacturer would like to replace process 1 by process 2. Thus, we are interested in testing the hypothesis that $p_1 \ge p_2$ against the alternative that $p_1 < p_2$.

A more general formulation of the problem can be given as follows: Consider two binomial distributions. Let $p_1$ be the probability of a success in a single trial according to the first binomial distribution, and let $p_2$ be the probability of a success in a single trial according to the second binomial distribution. We shall use the symbol 1 for success and the symbol 0 for failure. Suppose that the probabilities $p_1$ and $p_2$ are unknown. We consider the problem of testing the hypothesis that $p_1 \ge p_2$ on the basis of a sample consisting of $N_1$ observations from the first binomial distribution and $N_2$ observations from the second binomial population. Since in many experiments the case $N_1 = N_2$ is mainly of interest, and since this case (as we shall see later) makes an exact and simplified mathematical treatment of the problem possible, we shall assume in what follows that $N_1 = N_2 = N$ (say).

Thus, on the basis of the outcome of the two series of $N$ independent trials we have to decide whether the hypothesis $p_1 \ge p_2$ should be accepted or rejected.

5.3.2. The classical method. The classical solution of the problem for large $N$ is given as follows: Let $S_1$ be the number of successes in the first set of $N$ trials (drawn from the first binomial population), and let $S_2$ be the number of successes in the second set of $N$ trials (drawn from the second binomial population). Denote $\frac{S_1 + S_2}{2N}$ by $\bar p$ and $1 - \bar p$ by $\bar q$. Then for large $N$ the expression

(5.15)    $\dfrac{S_2 - S_1}{\sqrt{2N\bar p\,\bar q}}$

is normally distributed with zero mean and unit variance if $p_1 = p_2$. Suppose that the level of significance we wish to choose is $\alpha$. Let $\lambda_\alpha$ be the value for which the probability that a normal variate with zero mean and unit variance will exceed $\lambda_\alpha$ is equal to $\alpha$. (For example, if $\alpha = .05$, $\lambda_\alpha = 1.64$.) Thus, if $p_1 = p_2$, the probability that the expression (5.15) will exceed $\lambda_\alpha$ is equal to $\alpha$. If $p_1 > p_2$, the probability that the expression (5.15) will exceed $\lambda_\alpha$ is less than $\alpha$. According to the classical method the hypothesis that $p_1 \ge p_2$ is rejected if the observed value of (5.15) exceeds $\lambda_\alpha$. This method involves an approximation. The distribution of the expression (5.15) is not exactly normal even for large $N$. For small $N$ this method cannot be used, since the distribution of (5.15) is far from normal. For small $N$, R. A. Fisher has proposed an exact method which, however, involves cumbersome calculations. In section 5.3.3 we shall suggest another method which is exact (does not involve any approximations) and is simple to apply as far as computations are concerned. The latter method has the further advantage of being suitable for sequential analysis to which existing methods are not readily adaptable.
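The classical large-sample test can be sketched in a few lines. One assumption should be flagged: the scan is damaged where the quantities in the denominator of (5.15) are defined, and the code takes them to be the pooled proportion of successes $(S_1 + S_2)/(2N)$ and its complement. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def classical_test(s1, s2, n, alpha=0.05):
    """Return (statistic (5.15), critical value, reject?) for H: p1 >= p2.

    Assumes the pooled proportion (s1 + s2) / (2n) in the denominator,
    and 0 < s1 + s2 < 2n so that the denominator is not zero.
    """
    p_bar = (s1 + s2) / (2 * n)
    q_bar = 1 - p_bar
    stat = (s2 - s1) / sqrt(2 * n * p_bar * q_bar)
    lam = NormalDist().inv_cdf(1 - alpha)  # exceeded with probability alpha
    return stat, lam, stat > lam


# Assumed example: 40 successes out of 100 for process 1, 60 out of 100 for process 2.
stat, lam, reject = classical_test(40, 60, 100)
```

The normal approximation is only as good as $N$ is large, which is the limitation noted in the text.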

5.3.3. An exact method. Let $a_1, \dots, a_N$ be the results in the first set of $N$ trials, and $b_1, \dots, b_N$ the results in the second set of $N$ trials. These results are arranged in the order observed. Consider the sequence of $N$ pairs

(5.16)    $(a_1, b_1), \dots, (a_N, b_N)$.

Let $t_1$ be the number of pairs $(1, 0)$ and $t_2$ the number of pairs $(0, 1)$ in this sequence. We consider only the pairs $(0, 1)$ and $(1, 0)$ and base the test on them. Let $a$ be the outcome of an observation from the first population, and $b$ the outcome of an observation from the second population. The probability that $(a, b) = (1, 0)$ is equal to $p_1(1 - p_2)$, and the probability that $(a, b) = (0, 1)$ is equal to $(1 - p_1)p_2$. Hence, knowing that $(a, b)$ is equal to one of the pairs $(0, 1)$ and $(1, 0)$, the (conditional) probability that it is equal to $(0, 1)$ is given by

(5.17)    $p = \dfrac{(1 - p_1)p_2}{p_1(1 - p_2) + p_2(1 - p_1)}$,

and the (conditional) probability that it is equal to $(1, 0)$ is given by

(5.18)    $1 - p = \dfrac{p_1(1 - p_2)}{p_1(1 - p_2) + (1 - p_1)p_2}$.

Hence, considering only the pairs $(1, 0)$ and $(0, 1)$ the variate $t_2$ is distributed like the number of successes in a sequence of $t = t_1 + t_2$ independent trials, the probability of a success in a single trial being equal to $p$. One can easily verify that $p = \frac{1}{2}$ if $p_1 = p_2$, $p < \frac{1}{2}$ if $p_1 > p_2$ and $p > \frac{1}{2}$ if $p_1 < p_2$. Thus, the hypothesis to be tested, i.e., the hypothesis that $p_1 \ge p_2$, is equivalent to the hypothesis that $p \le \frac{1}{2}$. Thus, we can test the hypothesis that $p_1 \ge p_2$ by testing the hypothesis that $p \le \frac{1}{2}$ on the basis of the observed value of $t_2$. Since the distribution of $t_2$ is the same as the distribution of the number of successes in $t = t_1 + t_2$ independent trials ($t$ is treated as a constant and the probability of a success in a single trial is equal to $p$), the test procedure can be carried out in the usual manner. If we want a level of significance $\alpha$, a critical value $T$ is chosen so that for $p = \frac{1}{2}$ the probability that $t_2 \ge T$ is equal to $\alpha$. The hypothesis that $p \le \frac{1}{2}$ is rejected if and only if the observed $t_2$ is greater than or equal to the critical value $T$. The value of $T$ can be obtained from a table of the binomial distribution. If $t$ is large, $t_2$ is nearly normally distributed and the critical value $T$ can be obtained from a table of the normal distribution.
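The exact method can be sketched as follows. Because the binomial distribution is discrete, a critical value with tail probability exactly equal to $\alpha$ generally does not exist; the code therefore takes the smallest $T$ whose tail probability does not exceed $\alpha$, which is an assumption of this illustration. Function names are hypothetical.

```python
from math import comb


def critical_value(t, alpha):
    """Smallest T such that P(t2 >= T) <= alpha when p = 1/2 and there are t trials."""
    tail = 0.0
    for k in range(t, -1, -1):
        tail += comb(t, k) / 2**t  # adds P(t2 = k); tail is now P(t2 >= k)
        if tail > alpha:
            return k + 1  # k is too small, so k + 1 is the smallest admissible T
    return 0


def paired_test(a, b, alpha):
    """Count discordant pairs and test p <= 1/2; returns (t2, T, reject?)."""
    t1 = sum(1 for x, y in zip(a, b) if (x, y) == (1, 0))
    t2 = sum(1 for x, y in zip(a, b) if (x, y) == (0, 1))
    big_t = critical_value(t1 + t2, alpha)
    return t2, big_t, t2 >= big_t
```

Pairs $(0, 0)$ and $(1, 1)$ are ignored, so only the $t = t_1 + t_2$ discordant pairs enter the test. A returned $T$ of $t + 1$ means that no outcome leads to rejection at the chosen level.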

This procedure thus provides a simple test of the hypothesis that $p_1 \ge p_2$. The question arises whether the efficiency of this method is as high as that of the classical method. It would seem that the method suggested here cannot be a most efficient procedure, since the values of $t_1$ and $t_2$ depend on the order of the elements in the sequences $(a_1, \dots, a_N)$ and $(b_1, \dots, b_N)$, and there is no particular reason to arrange them in the order observed. However, it has been shown in [7] that the loss in efficiency as compared with the classical method is negligible if the number $N$ of trials is large.**

It should be pointed out that the procedure for testing the hypothesis that $p_1 \ge p_2$ can be used also for testing the hypothesis that $p_1 = p_2$ if the alternative hypotheses are restricted to $p_2 > p_1$.

In addition to simplicity and exactness the present method seems superior to the classical one in the following respect: Suppose that (contrary to the original assumption) the probability of a success varies from trial to trial. Denote by $p_1^{(i)}$ the probability of success in the $i$-th trial of the first set, and by $p_2^{(i)}$ the probability of success in the $i$-th trial in the second set ($i = 1, \dots, N$). Assume that the probabilities $p_1^{(i)}$ and $p_2^{(i)}$ are entirely unknown and we wish to test the hypothesis that $p_1^{(i)} = p_2^{(i)}$ ($i = 1, \dots, N$). In this case the classical method is not applicable, but the present method provides a correct procedure. Such a situation may arise, for instance, if we want to test the hypothesis that the probability of a success (hitting the target) is the same for two different guns. In the course of the experiments the probability of a hit may change due to external conditions such as wind, disposition of the gunner, etc. However, these external conditions are likely to affect both guns equally if the trials are made alternately (or approximately alternately), so that if the two guns are equally good we have $p_1^{(i)} = p_2^{(i)}$ ($i = 1, \dots, N$).

** The author believes that the loss in efficiency is slight even when $N$ is small, although no exact investigation of this case has been made.

5.3.4. Sequential test of the hypothesis that $p_1 \ge p_2$. In order to devise a proper sequential test for testing the hypothesis that $p_1 \ge p_2$, we have to state first what risks of making wrong decisions we are willing to tolerate. The efficiency of the production process 1 may be measured by the ratio of effectives to ineffectives produced, i.e., by $k_1 = \frac{p_1}{1 - p_1}$. Production process 1 may be regarded the more efficient the larger the value of $k_1$. Similarly, the efficiency of production process 2 may be measured by $k_2 = \frac{p_2}{1 - p_2}$. The relative superiority of production process 2 over the process 1 can then reasonably be measured by the ratio of $k_2$ to $k_1$, i.e., by

(5.19)    $u = \dfrac{k_2}{k_1} = \dfrac{p_2(1 - p_1)}{p_1(1 - p_2)}$.

If $u = 1$, the two processes are equally good. If $u > 1$, process 2 is superior to process 1, and if $u < 1$, process 1 is superior to process 2. Thus, the manufacturer will, in general, be able to select two values of $u$, $u_0$ and $u_1$ say ($u_0 < u_1$), such that the rejection of process 1 in favor of process 2 is considered an error of practical importance whenever the true value of $u \le u_0$, and the maintenance of process 1 is considered an error of practical importance whenever $u \ge u_1$. If $u$ lies between $u_0$ and $u_1$, the manufacturer does not care particularly which decision is taken.

Clearly, we will always have $u_0 < u_1$. If the transition from production process 1 to process 2 involves some cost or other inconveniences, it seems reasonable to put $u_0 = 1$ (or $u_0$ may even be slightly greater than one). This choice of $u_0$ really means that we consider the rejection of process 1 a serious error whenever this process is not inferior to process 2. On the other hand, if the transition from process 1 to process 2 does not involve any inconveniences, the rejection of process 1 in favor of 2 cannot be a serious error when the two processes are equally efficient, i.e., when $u = 1$. Thus, in such a case, it seems reasonable to choose $u_0$ somewhat below 1.

After the quantities $u_0$ and $u_1$ have been chosen the risks that we are willing to tolerate may reasonably be expressed in the following form: The probability of rejecting process 1 should not exceed a preassigned value $\alpha$ whenever $u \le u_0$, and the probability of maintaining process 1 should not exceed a preassigned value $\beta$ whenever $u \ge u_1$.

Thus, the risks that we are willing to tolerate are characterized by the four quantities $u_0$, $u_1$, $\alpha$ and $\beta$. After these four quantities have been chosen, a proper sequential test can be carried out as follows: The (conditional) probability that we obtain a pair $(0, 1)$, as given in (5.17), can be expressed as a function of $u$. In fact

(5.20)    $p = \dfrac{(1 - p_1)p_2}{p_1(1 - p_2) + p_2(1 - p_1)} = \dfrac{\frac{(1 - p_1)p_2}{p_1(1 - p_2)}}{1 + \frac{(1 - p_1)p_2}{p_1(1 - p_2)}} = \dfrac{u}{1 + u}$.

Let $H_0$ denote the hypothesis that $p = \frac{u_0}{1 + u_0}$, and $H_1$ the hypothesis that $p = \frac{u_1}{1 + u_1}$. A proper sequential test satisfying our requirements concerning tolerated risks is the sequential probability ratio test of $H_0$ against $H_1$. The acceptance and rejection numbers for this sequential test can be obtained from (5.9) and (5.10) by substituting $\frac{u_0}{1 + u_0}$ for $p_0$, $\frac{u_1}{1 + u_1}$ for $p_1$ and $t = t_1 + t_2$ for $m$.

Thus, for each value of t the acceptance number is given by 




(5.21) 


A, = 


log Ui - log th 

and the rejection number is given by 

1 1-/3 

log 

(5.22) Rt - 


a 


+ t 


+ t 


log 


1 -f Ui 
1 + Wo 


log Ui — log Wo 


log 


1 4- W i 
1 4- Wo 


log Wi — log Uq log W1 — log Wo 


These acceptance numbers $A_t$ and rejection numbers $R_t$ ($t = 1, 2, \cdots$) are best
tabulated before experimentation starts. The sequential test is then carried out
as follows: The observations are taken in pairs where each pair consists of an
observation from the first process and an observation from the second process.
We continue taking pairs as long as $A_t < t_2 < R_t$. At the first time when $t_2$
does not lie between the acceptance and rejection numbers, experimentation is
terminated. Process 1 is maintained if at this final stage $t_2 \le A_t$, and process 1
is rejected in favor of 2 if $t_2 \ge R_t$.
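The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `boundaries` and `run_test` are hypothetical, pairs are given as tuples `(x1, x2)` of 0/1 outcomes from process 1 and process 2, and $t_2$ counts the pairs (0, 1) as in the text.

```python
import math

def boundaries(t, u0, u1, alpha, beta):
    """Acceptance number A_t and rejection number R_t, as in (5.21) and (5.22)."""
    denom = math.log(u1) - math.log(u0)
    slope = math.log((1 + u1) / (1 + u0)) / denom
    a_t = math.log(beta / (1 - alpha)) / denom + t * slope
    r_t = math.log((1 - beta) / alpha) / denom + t * slope
    return a_t, r_t

def run_test(pairs, u0, u1, alpha, beta):
    """Scan the pairs in order; return (decision, number of pairs inspected)."""
    t = 0   # number of pairs (0, 1) and (1, 0) seen so far
    t2 = 0  # number of pairs (0, 1) seen so far
    for used, (x1, x2) in enumerate(pairs, start=1):
        if x1 == x2:
            continue  # pairs (0, 0) and (1, 1) carry no information here
        t += 1
        if (x1, x2) == (0, 1):
            t2 += 1
        a_t, r_t = boundaries(t, u0, u1, alpha, beta)
        if t2 <= a_t:
            return "maintain process 1", used
        if t2 >= r_t:
            return "reject process 1", used
    return "no decision", len(pairs)
```

In practice the numbers $A_t$ and $R_t$ would be tabulated once, as the text recommends; recomputing them at each step is done here only to keep the sketch short.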

The test procedure can also be carried out graphically as shown in Figure 5.
The total number $t$ of pairs (0, 1) and (1, 0) is measured along the horizontal
axis. The points $(t, A_t)$ will lie on a straight line $L_0$, since $A_t$ is a linear function
of $t$. The points $(t, R_t)$ will lie on a parallel line $L_1$. We draw the lines $L_0$ and
$L_1$ and plot the points $(t, t_2)$ as experimentation goes on. At the first time when
the point $(t, t_2)$ is not within the lines $L_0$ and $L_1$ experimentation is terminated.
Process 1 is maintained if at the final stage the point $(t, t_2)$ lies on $L_0$ or below,
and process 1 is rejected if the point $(t, t_2)$ lies on $L_1$ or above.

5.3.5. The operating characteristic curve of the test. For any value $u$ of the ratio
$k_2/k_1$ we shall denote by $L_u$ the probability of maintaining process 1. Clearly, $L_u$
is a function of $u$. This function $L_u$ is called the operating characteristic curve
of the test. The operating characteristic curve can be determined from the
following two equations:

$$\frac{u}{1+u} = \frac{1 - \left(\dfrac{1+u_0}{1+u_1}\right)^{h}}{\left(\dfrac{u_1(1+u_0)}{u_0(1+u_1)}\right)^{h} - \left(\dfrac{1+u_0}{1+u_1}\right)^{h}} \tag{5.23}$$

and

$$L_u = \frac{\left(\dfrac{1-\beta}{\alpha}\right)^{h} - 1}{\left(\dfrac{1-\beta}{\alpha}\right)^{h} - \left(\dfrac{\beta}{1-\alpha}\right)^{h}}. \tag{5.24}$$

For any given value $h$ we compute the values of $u$ and $L_u$ from these equations.
The point $(u, L_u)$ obtained in this way will be a point of the operating
characteristic curve. Calculating the points $(u, L_u)$ for a sufficiently large number of
values of $h$ we can draw the operating characteristic curve.
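As an illustration, the parametric computation of a point $(u, L_u)$ from a given $h$ might look as follows. The function name `oc_point` is hypothetical, and the sketch assumes $h \neq 0$ (at $h = 0$ both expressions take the indeterminate form 0/0).

```python
import math

def oc_point(h, u0, u1, alpha, beta):
    """Return (u, L_u) for a given h != 0, following (5.23) and (5.24)."""
    a = u1 * (1 + u0) / (u0 * (1 + u1))
    b = (1 + u0) / (1 + u1)
    p = (1 - b**h) / (a**h - b**h)   # p = u / (1 + u), equation (5.23)
    u = p / (1 - p)
    A = (1 - beta) / alpha
    B = beta / (1 - alpha)
    L = (A**h - 1) / (A**h - B**h)   # equation (5.24)
    return u, L
```

With $h = 1$ the sketch returns the point $(u_0, 1 - \alpha)$ and with $h = -1$ the point $(u_1, \beta)$, which agrees with the tolerated risks stated earlier.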

5.3.6. The average amount of inspection required by the test. For any value $u$
of the ratio $k_2/k_1$ denote by $E_u(t)$ the expected value of the total number of pairs
(0,1) and (1, 0) required by the test. The value of $E_u(t)$ can be obtained from
(5.14) by substituting $E_u(t)$ for $E_p(n)$, $L_u$ for $L_p$, $\dfrac{u}{1+u}$ for $p$, $\dfrac{u_1}{1+u_1}$ for $p_1$ and
$\dfrac{u_0}{1+u_0}$ for $p_0$. Thus

$$E_u(t) = \frac{L_u \log\dfrac{\beta}{1-\alpha} + (1 - L_u)\log\dfrac{1-\beta}{\alpha}}{\dfrac{u}{1+u}\log\dfrac{u_1(1+u_0)}{u_0(1+u_1)} + \dfrac{1}{1+u}\log\dfrac{1+u_0}{1+u_1}}. \tag{5.25}$$

To compute the expected value of the total number of pairs (including also
the pairs (0, 0) and (1, 1)), we merely have to divide the right side expression in
(5.25) by $p_1(1 - p_2) + p_2(1 - p_1)$.

In the rare event that no decision is yet reached at a number of pairs equal to 
three times the expected value, we can truncate the test at that stage without 
seriously affecting the probabilities of making a wrong decision (see section 4.6 
in Part I). 
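A minimal numerical sketch of (5.25) is given below. The function name `expected_pairs` and its optional arguments `p1`, `p2` are illustrative assumptions; the value $L_u$ is taken as an input, and the formula cannot be used at a value of $u$ for which the denominator vanishes.

```python
import math

def expected_pairs(u, L_u, u0, u1, alpha, beta, p1=None, p2=None):
    """Approximate E_u(t) from (5.25).

    If p1 and p2 are supplied, the result is divided by
    p1*(1-p2) + p2*(1-p1) so that the pairs (0,0) and (1,1) are counted too.
    """
    num = (L_u * math.log(beta / (1 - alpha))
           + (1 - L_u) * math.log((1 - beta) / alpha))
    den = (u / (1 + u) * math.log(u1 * (1 + u0) / (u0 * (1 + u1)))
           + 1 / (1 + u) * math.log((1 + u0) / (1 + u1)))
    e_t = num / den
    if p1 is not None and p2 is not None:
        e_t /= p1 * (1 - p2) + p2 * (1 - p1)
    return e_t
```

Three times the value returned here gives the truncation point mentioned above.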

5.3.7. Observations made in groups of r. In applications it may happen that at 
each stage in the sequential process instead of drawing a single observation we 
draw r observations from each of the binomial distributions. Hence, instead of 
a single pair, we have two sets of r observations. If the order of observations 
in each such set of r is recorded, we can establish the number of pairs (0, 1) and 
the number of pairs (1,0) for each pair of sets of r observations. In such a case 
the test can be carried out as described in section 5.3.4, since after each pair of 
sets of $r$ observations we can compute $t$ and $t_2$. The only effect of taking the
observations in groups of $r$ is that more observations will generally be necessary
(approximately enough to fill out a group) and thereby the probability of making
an incorrect decision will be made somewhat smaller. However, if the order of
observations in such groups of $r$ is not recorded, the difficulty arises that we are
not able to determine the values of $t$ and $t_2$ needed for the test procedure. It has
been shown in [7] that in such a case we may replace $t$ and $t_2$ by certain estimates
of $t$ and $t_2$ without affecting seriously the probability of making an incorrect
decision. The estimates of $t_1$ and $t_2$ (and thereby also an estimate of $t = t_1 + t_2$)
are obtained as follows: Let $r_1$ be the number of successes in the group of $r$
observations drawn from the first binomial distribution, and let $r_2$ be the number
of successes in the group of $r$ observations drawn from the second binomial
distribution. Then for this pair of groups of $r$ observations, we estimate the number
of pairs (1,0) to be $r_1 - \dfrac{r_1 r_2}{r}$ and the number of pairs (0,1) to be $r_2 - \dfrac{r_1 r_2}{r}$. Thus,
an estimate of $t_1$ is obtained by summing $r_1 - \dfrac{r_1 r_2}{r}$ over all pairs of groups
observed, and that of $t_2$ is obtained by summing $r_2 - \dfrac{r_1 r_2}{r}$ over all pairs of groups
observed.
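The estimates for unordered groups can be sketched as below. The function name `estimate_counts` and the input layout (one tuple `(r1, r2)` of success counts per pair of groups) are assumptions made for the example.

```python
def estimate_counts(groups, r):
    """Estimate t1, t2 and t = t1 + t2 from success counts in groups of size r.

    groups: iterable of (r1, r2), the numbers of successes in the group from
    the first and from the second binomial distribution respectively.
    """
    groups = list(groups)
    t1 = sum(r1 - r1 * r2 / r for r1, r2 in groups)  # estimated pairs (1, 0)
    t2 = sum(r2 - r1 * r2 / r for r1, r2 in groups)  # estimated pairs (0, 1)
    return t1, t2, t1 + t2
```

For example, with $r = 4$ and the two pairs of groups (3, 1) and (2, 2), the estimates are $t_1 = 3.25$, $t_2 = 1.25$ and $t = 4.5$.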

5.4. Application to testing the mean of a normal distribution with known standard
deviation. 5.4.1. Formulation of the problem. Suppose that a measurable
quantity $x$ is normally distributed with unknown mean $\theta$ and known standard
deviation $\sigma$. For example, $x$ may be some measurable quality characteristic
of a unit of a certain product where $x$ is normally distributed with a known
standard deviation in the population of all units. The problem we shall consider
here is to test the hypothesis that the unknown mean $\theta$ is less than a specified
value $\theta'$. This problem arises frequently, for example, in quality control.
Suppose that the quality of the product is considered the better the higher the
mean value of $x$. Thus, there will be a value $\theta'$ such that the product is considered
sub-standard if $\theta < \theta'$ and the product is considered to meet specifications
if $\theta \ge \theta'$. Since $\theta$ is unknown, we are usually interested in testing the hypothesis
that $\theta < \theta'$, i.e., that the product is sub-standard.

Since quality control is an important field of application for such test procedures,
the discussion will be continued in the terminology of quality control.
This, of course, should not be interpreted as a restriction upon the general
validity and applicability of the test procedure. The problem treated in section
5.4 can now be stated as follows: Let $x$ be a measurable quality characteristic
of a unit of a certain product. The variable $x$ is supposed to be normally
distributed with known standard deviation in the population of all units produced.
The problem is to devise a sampling plan for testing the hypothesis
that the product is sub-standard. The product is said to be sub-standard, if
the mean $\theta$ of $x$ is less than a given specified value $\theta'$.

5.4.2. Tolerated risks for making a wrong decision. No sampling plan can 
guarantee that the correct decision will always be made, i.e., that the product 
will be declared sub-standard if and only if $\theta < \theta'$. The larger the amount of
inspection, the smaller we can make the risks for making a wrong decision. If 
inspection is costly, or destructive, we are willing to tolerate some risks of making 
wrong decisions in order to reduce the necessary amount of inspection. Thus, 
a proper sampling plan can be recommended only after the risks that can be 
tolerated have been stated. 

If the quality of the product is exactly on the margin, i.e., if $\theta = \theta'$, then it
will make little difference whether the product is classified as sub-standard or
not. However, if $\theta$ is considerably smaller than $\theta'$, then the acceptance of the
hypothesis that the product meets specifications (rejection of the hypothesis
that the product is sub-standard) will usually be considered as a serious error.
Similarly, if $\theta$ is much larger than $\theta'$, the acceptance of the hypothesis that the
product is sub-standard will generally be considered as a serious error. Thus,
the manufacturer will, in general, be able to select two values $\theta_0$ and $\theta_1$, say
($\theta_0 < \theta'$ and $\theta_1 > \theta'$), such that the classification of the product as satisfactory
(meeting specifications) is considered an error of practical importance whenever
$\theta \le \theta_0$, and the classification of the product as sub-standard is considered an
error of practical importance whenever $\theta \ge \theta_1$. If $\theta$ lies between $\theta_0$ and $\theta_1$, a
wrong classification of the product will not be viewed as a serious error, since
in this case $\theta$ is near the marginal value $\theta'$.

After the two values $\theta_0$ and $\theta_1$ have been selected, the risks that we are willing
to tolerate can be stated in the following form: A sampling plan is required
such that the probability of classifying the product as satisfactory is less than
or equal to a preassigned quantity $\alpha$ whenever $\theta \le \theta_0$, and such that the
probability of classifying the product as sub-standard is less than or equal to a
preassigned quantity $\beta$ whenever $\theta \ge \theta_1$. Thus, the tolerated risks are
characterized by the four quantities $\theta_0$, $\theta_1$, $\alpha$ and $\beta$. A proper sampling plan can
be devised after these four quantities have been selected.

5.4.3. A sequential test of the hypothesis that $\theta < \theta'$ (the product is sub-standard).
Let $H_0$ be the hypothesis that $\theta = \theta_0$ and let $H_1$ be the hypothesis that $\theta = \theta_1$.
Let $T$ be the sequential probability ratio test for testing $H_0$ against $H_1$ such that
$\alpha$ is the probability of accepting $H_1$ when $H_0$ is true and $\beta$ is the probability of
accepting $H_0$ when $H_1$ is true. This sequential test will satisfy all our requirements,
since for this test the probability of accepting $H_0$ (declaring the product
as sub-standard) is $\le \beta$ whenever $\theta \ge \theta_1$, and the probability of accepting $H_1$
(declaring the product as satisfactory) is $\le \alpha$ whenever $\theta \le \theta_0$.

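The sequential test $T$ is given as follows: Denote the successive observations
on $x$ by $x_1, x_2, \cdots$, etc. Accept the hypothesis that the product is satisfactory
at the $m$-th observation if

$$\log\frac{e^{-\frac{1}{2\sigma^2}\sum_{a=1}^{m}(x_a-\theta_1)^2}}{e^{-\frac{1}{2\sigma^2}\sum_{a=1}^{m}(x_a-\theta_0)^2}} \ge \log\frac{1-\beta}{\alpha}. \tag{5.26}$$

Accept the hypothesis that the product is sub-standard if

$$\log\frac{e^{-\frac{1}{2\sigma^2}\sum_{a=1}^{m}(x_a-\theta_1)^2}}{e^{-\frac{1}{2\sigma^2}\sum_{a=1}^{m}(x_a-\theta_0)^2}} \le \log\frac{\beta}{1-\alpha}. \tag{5.27}$$

Take an additional observation if

$$\log\frac{\beta}{1-\alpha} < \log\frac{e^{-\frac{1}{2\sigma^2}\sum_{a=1}^{m}(x_a-\theta_1)^2}}{e^{-\frac{1}{2\sigma^2}\sum_{a=1}^{m}(x_a-\theta_0)^2}} < \log\frac{1-\beta}{\alpha}. \tag{5.28}$$

The inequalities (5.26), (5.27) and (5.28) are equivalent to

$$\sum_{a=1}^{m} x_a \ge \frac{\sigma^2}{\theta_1-\theta_0}\log\frac{1-\beta}{\alpha} + m\,\frac{\theta_0+\theta_1}{2}, \tag{5.29}$$

$$\sum_{a=1}^{m} x_a \le \frac{\sigma^2}{\theta_1-\theta_0}\log\frac{\beta}{1-\alpha} + m\,\frac{\theta_0+\theta_1}{2}, \tag{5.30}$$

$$\frac{\sigma^2}{\theta_1-\theta_0}\log\frac{\beta}{1-\alpha} + m\,\frac{\theta_0+\theta_1}{2} < \sum_{a=1}^{m} x_a < \frac{\sigma^2}{\theta_1-\theta_0}\log\frac{1-\beta}{\alpha} + m\,\frac{\theta_0+\theta_1}{2}, \tag{5.31}$$

respectively.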

Using the inequalities (5.29), (5.30) and (5.31) the test procedure can easily
be carried out as follows: For each $m$ compute the acceptance number

$$A_m = \frac{\sigma^2}{\theta_1-\theta_0}\log\frac{\beta}{1-\alpha} + m\,\frac{\theta_0+\theta_1}{2} \tag{5.32}$$

and the rejection number

$$R_m = \frac{\sigma^2}{\theta_1-\theta_0}\log\frac{1-\beta}{\alpha} + m\,\frac{\theta_0+\theta_1}{2}. \tag{5.33}$$

These acceptance numbers $A_m$ and rejection numbers $R_m$ are best tabulated
before inspection starts. Inspection is continued as long as $A_m < \sum_{a=1}^{m} x_a < R_m$.
At the first time that $\sum_{a=1}^{m} x_a$ does not lie between $A_m$ and $R_m$, inspection
is terminated. If at this final stage $\sum_{a=1}^{m} x_a \le A_m$, the hypothesis that the
product is sub-standard is accepted, and if $\sum_{a=1}^{m} x_a \ge R_m$, the hypothesis that
the product is sub-standard is rejected.
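The rule based on the cumulative sum can be sketched as follows. The names `normal_boundaries` and `sprt_normal_mean` are hypothetical and the observations are assumed to be supplied as a list of numbers; the sketch is meant only to make the bookkeeping concrete.

```python
import math

def normal_boundaries(m, theta0, theta1, sigma, alpha, beta):
    """Acceptance number A_m and rejection number R_m of (5.32) and (5.33)."""
    scale = sigma**2 / (theta1 - theta0)
    mid = m * (theta0 + theta1) / 2
    a_m = scale * math.log(beta / (1 - alpha)) + mid
    r_m = scale * math.log((1 - beta) / alpha) + mid
    return a_m, r_m

def sprt_normal_mean(xs, theta0, theta1, sigma, alpha, beta):
    """Return (decision, number of observations used)."""
    total = 0.0
    for m, x in enumerate(xs, start=1):
        total += x
        a_m, r_m = normal_boundaries(m, theta0, theta1, sigma, alpha, beta)
        if total <= a_m:
            return "sub-standard", m      # hypothesis of sub-standard accepted
        if total >= r_m:
            return "satisfactory", m      # hypothesis of sub-standard rejected
    return "no decision", len(xs)
```

Since both boundaries have the common slope $(\theta_0+\theta_1)/2$, only the two intercepts depend on $\alpha$, $\beta$ and $\sigma$.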

The test procedure can also be carried out graphically as shown in Figure 6.
The number $m$ of observations is measured along the horizontal axis. The
points $(m, A_m)$ will lie on a straight line $L_0$ and the points $(m, R_m)$ will lie on a
parallel line $L_1$. We draw the parallel lines $L_0$ and $L_1$ and plot the points
$\left(m, \sum_{a=1}^{m} x_a\right)$ as inspection goes on. At the first time when the point $\left(m, \sum_{a=1}^{m} x_a\right)$
does not lie between the lines $L_0$ and $L_1$ inspection is terminated. The hypothesis
that the product is sub-standard is rejected if the point $\left(m, \sum_{a=1}^{m} x_a\right)$ lies on $L_1$
or above. The hypothesis in question is accepted if the point $\left(m, \sum_{a=1}^{m} x_a\right)$
lies on $L_0$ or below.

Fig. 6

5.4.4. The operating characteristic curve of the test. For any value $\theta$ denote by
$L_\theta$ the probability that the hypothesis that the product is sub-standard is
accepted. Obviously, $L_\theta$ will be a function of $\theta$ and is called the operating
characteristic curve of the test. The shape of the operating characteristic curve
will, in general, be of the type shown in Figure 7. $L_\theta$ approaches 1 as $\theta \to -\infty$
and $L_\theta$ approaches zero as $\theta \to +\infty$. Furthermore, $L_\theta$ is a decreasing function
of $\theta$. We already know the values of $L_\theta$ for $\theta = \theta_0$ and $\theta = \theta_1$. Now we shall
give a method for computing the value of $L_\theta$ for any $\theta$. If $\dfrac{\theta_1 - \theta_0}{\sigma}$ is fairly small,
which will usually be the case in practice, a good approximation to $L_\theta$ is given
by (see equation 3.35)

$$L_\theta = \frac{\left(\dfrac{1-\beta}{\alpha}\right)^{h} - 1}{\left(\dfrac{1-\beta}{\alpha}\right)^{h} - \left(\dfrac{\beta}{1-\alpha}\right)^{h}} \tag{5.34}$$

where the constant $h$ is determined as follows: First we compute the characteristic
function $\varphi(t)$ of the variate

$$z = \log\frac{e^{-\frac{1}{2\sigma^2}(x-\theta_1)^2}}{e^{-\frac{1}{2\sigma^2}(x-\theta_0)^2}} = \frac{1}{2\sigma^2}\left[2(\theta_1-\theta_0)x + \theta_0^2 - \theta_1^2\right]. \tag{5.35}$$

Thus, $z$ is normally distributed with mean $\dfrac{2(\theta_1-\theta_0)\theta + \theta_0^2 - \theta_1^2}{2\sigma^2}$
and variance $\dfrac{(\theta_1-\theta_0)^2}{\sigma^2}$. Consequently, $\varphi(t)$ is given by

$$\varphi(t) = \exp\left\{\frac{2(\theta_1-\theta_0)\theta + \theta_0^2 - \theta_1^2}{2\sigma^2}\,t + \frac{(\theta_1-\theta_0)^2}{2\sigma^2}\,t^2\right\}. \tag{5.36}$$

The value $h$ is the non-zero real root of the equation $\varphi(t) = 1$. Hence

$$h = \frac{(\theta_1^2 - \theta_0^2) - 2(\theta_1-\theta_0)\theta}{(\theta_1-\theta_0)^2} = \frac{\theta_1 + \theta_0 - 2\theta}{\theta_1 - \theta_0}. \tag{5.37}$$

The operating characteristic curve can be computed from (5.34) substituting
the right hand side member of (5.37) for $h$.
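A short sketch of this computation is given below. The function name `oc_normal` is hypothetical. The treatment of the midpoint $\theta = (\theta_0+\theta_1)/2$, where $h = 0$, is an addition not made in the text: there the expression is replaced by its limiting value $\log A / (\log A - \log B)$ with $A = (1-\beta)/\alpha$ and $B = \beta/(1-\alpha)$.

```python
import math

def oc_normal(theta, theta0, theta1, alpha, beta):
    """Approximate L_theta using h = (theta1 + theta0 - 2*theta)/(theta1 - theta0)."""
    h = (theta1 + theta0 - 2 * theta) / (theta1 - theta0)
    A = (1 - beta) / alpha
    B = beta / (1 - alpha)
    if abs(h) < 1e-12:
        # limiting value of (A**h - 1)/(A**h - B**h) as h tends to zero
        return math.log(A) / (math.log(A) - math.log(B))
    return (A**h - 1) / (A**h - B**h)
```

At $\theta = \theta_0$ the sketch gives $1 - \alpha$ and at $\theta = \theta_1$ it gives $\beta$, in agreement with the two values of $L_\theta$ that are known in advance.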

5.4.5. The average amount of inspection required by the test. Let $E_\theta(n)$ denote
the expected value of the number of observations required by the test when $\theta$
is the true mean of $x$. According to (4.8) a good approximation to the value of
$E_\theta(n)$ is given by

$$E_\theta(n) = \frac{2\sigma^2\left[L_\theta \log\dfrac{\beta}{1-\alpha} + (1 - L_\theta)\log\dfrac{1-\beta}{\alpha}\right]}{\theta_0^2 - \theta_1^2 + 2(\theta_1-\theta_0)\theta}$$

where $L_\theta$ is given by (5.34).

In the rare event that the number of observations reaches three times the
expected value before the test is terminated, we can truncate the test at this
stage without seriously affecting the probabilities of making a wrong decision.
(See section 4.6 in Part I).
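The approximation for $E_\theta(n)$ can be evaluated as in the sketch below. The function name `expected_n` is hypothetical, $L_\theta$ is passed in as an argument, and the formula is not usable at $\theta = (\theta_0+\theta_1)/2$, where the denominator is zero.

```python
import math

def expected_n(theta, L_theta, theta0, theta1, sigma, alpha, beta):
    """Approximate expected number of observations when theta is the true mean."""
    num = (L_theta * math.log(beta / (1 - alpha))
           + (1 - L_theta) * math.log((1 - beta) / alpha))
    den = theta0**2 - theta1**2 + 2 * (theta1 - theta0) * theta
    return 2 * sigma**2 * num / den
```

For instance, with $\theta_0 = 0$, $\theta_1 = 1$, $\sigma = 1$ and $\alpha = \beta = 0.05$, the value at $\theta = \theta_0$ (where $L_\theta = 0.95$) is about 5.3 observations.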


6. Outline of a General Theory of Sequential Tests of Hypotheses when No
Restrictions Are Imposed on the Alternative Values of the Unknown Parameters

6.1. Sequential test of a simple hypothesis with no restrictions on the alternative
values of the unknown parameters. Consider the following general case. Let
$X_1, \cdots, X_p$ be a set of $p$ random variables and let $f(x_1, \cdots, x_p, \theta_1, \cdots, \theta_k)$
be the joint probability density function of these random variables involving $k$
unknown parameters $\theta_1, \cdots, \theta_k$. Suppose that we wish to test the hypothesis
$H_0$ that $\theta_1 = \theta_1^0, \cdots, \theta_k = \theta_k^0$, where $\theta_1^0, \cdots, \theta_k^0$ are some given specified values.
Denote the set of all a priori possible parameter points by $\Omega$. Assume that $\Omega$
contains at least a finite $k$-dimensional sphere with the center $(\theta_1^0, \cdots, \theta_k^0)$.
Let $\Omega^*$ be the set of all possible alternative parameter points; i.e., $\Omega^*$ is the
whole parameter space with the exception of the point $\theta^0 = (\theta_1^0, \cdots, \theta_k^0)$.

For any statistical procedure for testing $H_0$, the probability of an error of the
first kind will have a definite value, but the probability of an error of the second
kind will depend on the true alternative; i.e., it will be a single valued function
$\beta(\theta)$ defined over all points $\theta$ of $\Omega^*$. Let $w(\theta)$ be some non-negative function,
called weight function, such that $\int_{\Omega^*} w(\theta)\,d\theta = 1$. Suppose that we wish to
construct a sequential test such that the probability of an error of the first kind
is equal to a given $\alpha$ and that the weighted average $\int_{\Omega^*} w(\theta)\beta(\theta)\,d\theta$ of the
probabilities of errors of the second kind is equal to some given positive value
$\beta$. This problem can easily be solved as follows: Let $p_{0n}$ be equal to the product
$\prod_{a=1}^{n} f(x_{1a}, \cdots, x_{pa}, \theta_1^0, \cdots, \theta_k^0)$ where $x_{ia}$ denotes the $a$-th observation on
$X_i$ ($i = 1, \cdots, p$; $a = 1, \cdots, n$). Furthermore, let $p_{1n}$ be defined by

$$p_{1n} = \int_{\Omega^*} w(\theta) \prod_{a=1}^{n} f(x_{1a}, \cdots, x_{pa}, \theta_1, \cdots, \theta_k)\,d\theta. \tag{6.1}$$

The expression $p_{1n}$ can be interpreted as the probability density in the sample
space of $n$ observations on the variates $X_1, \cdots, X_p$, if we assume that the
parameter point $\theta$ in $\Omega^*$ has a probability distribution given by the density
function $w(\theta)\,d\theta$.

We shall denote by $H_1$ the hypothesis that the probability density function
in the sample space of $n$ observations on $X_1, \cdots, X_p$ is given by $p_{1n}$ defined in
equation (6.1). The problem of testing $H_0$ against the single alternative $H_1$
is not exactly of the type discussed in Part I, since $p_{1n}$ given in (6.1) cannot be
represented, in general, as a product of $n$ factors where the $a$-th factor depends
only on the observations $x_{1a}, \cdots, x_{pa}$. However, it was pointed out in section
3.2 that the fundamental inequalities derived in section 3.2 remain valid
also when $p_{1n}$ is given by an expression of the type (6.1). Thus, we can use the
sequential probability ratio test for testing $H_0$ against the single alternative $H_1$.
We reject $H_0$ if

$$\frac{p_{1n}}{p_{0n}} \ge A, \tag{6.2}$$

we accept $H_0$ if

$$\frac{p_{1n}}{p_{0n}} \le B, \tag{6.3}$$

and we make an additional observation if

$$B < \frac{p_{1n}}{p_{0n}} < A. \tag{6.4}$$

The expression $p_{1n}$ is given by (6.1) and the constants $A$ and $B$ are chosen so
that the probability of accepting $H_1$ when $H_0$ is true is $\alpha$, and the probability
of accepting $H_0$ when $H_1$ is true is $\beta$. Thus, for practical purposes we may put
$A = \dfrac{1-\beta}{\alpha}$ and $B = \dfrac{\beta}{1-\alpha}$.

Using the sequential process defined by the inequalities (6.2), (6.3), and (6.4)
we obviously have

$$\int_{\Omega^*} w(\theta)\beta(\theta)\,d\theta = \beta \tag{6.5}$$

where for each point $\theta$ in $\Omega^*$, $\beta(\theta)$ denotes the probability of accepting $H_0$ under
the assumption that $\theta$ is the true parameter point.

Thus, the sequential test given by (6.2), (6.3), and (6.4) provides a satisfactory
solution of the problem if we want a test procedure such that the probability of
an error of the first kind is $\alpha$ and the weighted average $\int_{\Omega^*} w(\theta)\beta(\theta)\,d\theta$ of the
probabilities of errors of the second kind is $\beta$. Practical problems, however,
do not always take this form. Many instances require a test procedure such
that $\beta(\theta)$ should be less than or equal to a given positive value $\beta$ for all parameter
points $\theta$ whose "distance" (defined in some sense) from $\theta^0$ is greater than or
equal to some given positive value $d_0$. The "distance" of two parameter points
$\theta'$ and $\theta''$ may be defined by some function $\delta(\theta', \theta'')$ which is equal to zero if
$\theta' = \theta''$ and is greater than zero if $\theta' \ne \theta''$. Furthermore, for any three points
$\theta', \theta'', \theta'''$ we have $\delta(\theta', \theta'') = \delta(\theta'', \theta')$ and $\delta(\theta', \theta'') + \delta(\theta'', \theta''') \ge \delta(\theta', \theta''')$.
The distance function will, in general, be chosen according to practical needs and
mathematical convenience.

Given the distance function $\delta(\theta', \theta'')$ and given the requirements that the
probability of an error of the first kind be $\alpha$ and the probability of an error of
the second kind should not exceed $\beta$ whenever the distance of the true parameter
point from $\theta^0$ is greater than or equal to $d_0$, the aim is, of course, to construct
a sequential test which satisfies these requirements with a minimum expected
number of observations.

While an exact solution of this problem has not yet been found, the following
approach seems reasonable: Let $\Omega_0$ be the set of all parameter points $\theta$ for which
$\delta(\theta^0, \theta) \ge d_0$. We restrict ourselves to the class $C_s$ of sequential tests based on
the ratio $\dfrac{p_{1n}}{p_{0n}}$ where

$$p_{0n} = \prod_{a=1}^{n} f(x_{1a}, \cdots, x_{pa}, \theta_1^0, \cdots, \theta_k^0), \tag{6.6}$$

$$p_{1n} = \int_{\Omega_0} w(\theta) \prod_{a=1}^{n} f(x_{1a}, \cdots, x_{pa}, \theta_1, \cdots, \theta_k)\,d\theta \tag{6.7}$$

and $w(\theta)$ may be any non-negative function of $\theta$, called weight function, for
which

$$\int_{\Omega_0} w(\theta)\,d\theta = 1. \tag{6.8}$$

For carrying out the sequential test two constants $A$ and $B$ are chosen. The
hypothesis $H_0$ is accepted if $\dfrac{p_{1n}}{p_{0n}} \le B$, $H_0$ is rejected if $\dfrac{p_{1n}}{p_{0n}} \ge A$, and an additional
observation is made if $B < \dfrac{p_{1n}}{p_{0n}} < A$. The restriction to the class $C_s$ of sequential
tests is suggested by the fact that we are led to these tests if it is required
that some weighted average of the probabilities of errors of the second kind be
equal to a given value $\beta$.

Accepting the restriction that the sequential test should be a member of the
class $C_s$, we still need a principle for choosing the weight function $w(\theta)$. It is
clear that the maximum of $\beta(\theta)$ in $\Omega_0$ depends on the quantities $A$, $B$, and the
weight function $w(\theta)$. Denote this maximum value by $\beta_{\max}[A, B, w(\theta)]$. Since
it is desirable to make $\beta_{\max}[A, B, w(\theta)]$ as small as possible, it is proposed to
determine $w(\theta)$ so that the expression $\beta_{\max}[A, B, w(\theta)]$ becomes a minimum with
respect to $w(\theta)$. Since for given values $A$ and $B$ the value of the weighted
average $\int_{\Omega_0} w(\theta)\beta(\theta)\,d\theta$ is practically independent of $w(\theta)$ (it is nearly equal
to $\dfrac{B(A-1)}{A-B}$), minimizing $\beta_{\max}[A, B, w(\theta)]$ is practically equivalent to minimizing
the difference $\beta_{\max}[A, B, w(\theta)] - \int_{\Omega_0} w(\theta)\beta(\theta)\,d\theta$. For convenience we
determine $w(\theta)$ so that $\beta_{\max}[A, B, w(\theta)] - \int_{\Omega_0} w(\theta)\beta(\theta)\,d\theta$ becomes a minimum.
For this weight function the maximum of $\beta(\theta)$ in $\Omega_0$ will depend only on $A$ and $B$.
Denote this value by $\beta(A, B)$. Finally we determine the values $A$ and $B$ so
that $\beta(A, B) = \beta$ and the probability of an error of the first kind becomes $\alpha$.

The determination of $w(\theta)$ is a problem in the calculus of variations. In
some important cases, however, the solution can be obtained by the following
simple procedure: Let $S(d)$ be the set of all parameter points $\theta$ for which
$\delta(\theta^0, \theta) = d$. Let $v(\theta)$ be a non-negative weight function defined over the
surface $S(d_0)$ so that the surface integral $\int_{S(d_0)} v(\theta)\,d\omega = 1$ (where $d\omega$ denotes
the infinitesimal surface element). Consider the following sequential
procedure: Reject $H_0$ if

$$\frac{\displaystyle\int_{S(d_0)} v(\theta)\left[\prod_{a=1}^{n} f(x_{1a}, \cdots, x_{pa}, \theta_1, \cdots, \theta_k)\right] d\omega}{\displaystyle\prod_{a=1}^{n} f(x_{1a}, \cdots, x_{pa}, \theta_1^0, \cdots, \theta_k^0)} \tag{6.9}$$

is greater than or equal to $A$, accept $H_0$ if (6.9) is less than or equal to $B$, and
make an additional observation if the value of (6.9) lies between $A$ and $B$. The
constants $A$ and $B$ are so chosen that the probability of an error of the first kind
is $\alpha$ and $\int_{S(d_0)} v(\theta)\beta(\theta)\,d\omega = \beta$. In many statistical problems it is possible to
find a weight function $v(\theta)$ such that for a conveniently chosen distance function
$\delta(\theta', \theta'')$ the probability $\beta(\theta)$ of an error of the second kind becomes constant on
the surface $S(d)$ for any value $d$, and, furthermore, $\beta(\theta)$ decreases with increasing
$d$. For such a weight function the sequential test based on (6.9) will
provide a solution of the problem. In fact, the weight function $v(\theta)$ over the
surface $S(d_0)$ can be considered a limiting case of a weight function $w(\theta)$ defined
in $\Omega_0$ which takes the value zero for any $\theta$ whose distance from $\theta^0$ is greater than
$d_0 + \Delta$ with $\Delta$ approaching zero in the limit. For the weight function $v(\theta)$ the
maximum of $\beta(\theta)$ in $\Omega_0$ is equal to the weighted integral of $\beta(\theta)$. Thus, for this
weight function the difference between the maximum of $\beta(\theta)$ and the weighted
integral of $\beta(\theta)$ is minimized.

We shall illustrate this procedure by a simple example. Let $X_1, \cdots, X_k$
be $k$ normally and independently distributed variates with unit variances. The
mean values $\theta_1, \cdots, \theta_k$ are unknown. Suppose that it is required to test the
hypothesis $H_0$ that $\theta_1 = \cdots = \theta_k = 0$. Assume that the distance of two points
$\theta'$ and $\theta''$ is equal to

$$+\sqrt{(\theta_1' - \theta_1'')^2 + \cdots + (\theta_k' - \theta_k'')^2}.$$

Then $S(d)$ is a sphere with center at the origin and radius $d$. Let $v(\theta)$ be constant
on $S(d_0)$ and equal to the reciprocal of the area of $S(d_0)$. We shall show
that for this weight function $v(\theta)$, $\beta(\theta)$ is constant on the sphere $S(d)$ and is
monotonically decreasing with increasing $d$. For this purpose we prove first
that (6.9) is a monotonically increasing function of $\bar{x}_1^2 + \cdots + \bar{x}_k^2$ where $\bar{x}_i$
is the arithmetic mean of the observations on $X_i$. In fact, the expression (6.9)
becomes

$$\frac{\dfrac{c_k}{(2\pi)^{kn/2}} \displaystyle\int_{S(d_0)} \exp\left[-\tfrac{1}{2}\sum_{i=1}^{k}\sum_{a=1}^{n}(x_{ia}-\theta_i)^2\right] d\omega}{\dfrac{1}{(2\pi)^{kn/2}} \exp\left[-\tfrac{1}{2}\displaystyle\sum_{i=1}^{k}\sum_{a=1}^{n} x_{ia}^2\right]} = c_k \exp\left[-\tfrac{1}{2} n d_0^2\right] \int_{S(d_0)} \exp\left(n\sum_{i=1}^{k} \bar{x}_i\theta_i\right) d\omega \tag{6.10}$$

where $c_k$ is the reciprocal of the area of $S(d_0)$ and $\bar{x}_i$ is the arithmetic mean of
the $n$ observations $x_{ia}$ ($a = 1, \cdots, n$). Let $r_x$ denote $\left|\sqrt{\bar{x}_1^2 + \cdots + \bar{x}_k^2}\right|$ and let
$a(\theta)$ ($0 \le a(\theta) \le \pi$) denote the angle between the vector $(\bar{x}_1, \cdots, \bar{x}_k)$ and the
vector $(\theta_1, \cdots, \theta_k)$. Then (6.10) can be written

$$c_k \exp\left[-\tfrac{1}{2} n d_0^2\right] \int_{S(d_0)} \exp\left(n r_x d_0 \cos[a(\theta)]\right) d\omega. \tag{6.11}$$

Because of the symmetry of the sphere, the value of (6.11) will not be changed
if we substitute $\gamma(\theta)$ for $a(\theta)$ where $\gamma(\theta)$ ($0 \le \gamma(\theta) \le \pi$) denotes the angle
between the vector $\theta$ and an arbitrarily chosen fixed vector $u$. From this it
follows that the value of (6.11) depends only on $r_x$.

Now we shall show that (6.11) is a strictly increasing function of $r_x$. For this
purpose we have merely to show that

$$I(r_x) = \int_{S(d_0)} \exp\left(n r_x d_0 \cos[\gamma(\theta)]\right) d\omega \tag{6.12}$$

is a strictly increasing function of $r_x$. We have

$$\frac{dI}{dr_x} = \int_{S(d_0)} n d_0 \cos[\gamma(\theta)] \exp\left(n r_x d_0 \cos[\gamma(\theta)]\right) d\omega. \tag{6.13}$$

Denote by $\omega_1$ the subset of $S(d_0)$ in which $0 \le \gamma(\theta) \le \frac{\pi}{2}$ and by $\omega_2$ the subset
in which $\frac{\pi}{2} \le \gamma(\theta) \le \pi$. Because of the symmetry of the sphere we have

$$\int_{\omega_2} n d_0 \cos[\gamma(\theta)] \exp\left(n r_x d_0 \cos[\gamma(\theta)]\right) d\omega = \int_{\omega_1} n d_0 \cos[\pi - \gamma(\theta)] \exp\left(n r_x d_0 \cos[\pi - \gamma(\theta)]\right) d\omega$$

$$= -\int_{\omega_1} n d_0 \cos[\gamma(\theta)] \exp\left(-n r_x d_0 \cos[\gamma(\theta)]\right) d\omega.$$

Hence

$$\frac{dI}{dr_x} = n d_0 \int_{\omega_1} \cos[\gamma(\theta)] \left\{\exp\left(n d_0 r_x \cos[\gamma(\theta)]\right) - \exp\left(-n d_0 r_x \cos[\gamma(\theta)]\right)\right\} d\omega. \tag{6.14}$$

The right hand side of (6.14) is positive. Hence, we have proved that expression
(6.11) (or (6.10)) is a strictly increasing function of $r_x$.

To show that $\beta(\theta)$ is constant on $S(d)$ and is monotonically decreasing with
increasing $d$, let $y_1, \cdots, y_k$ be an orthogonal linear transformation of $\bar{x}_1, \cdots, \bar{x}_k$
so that $E(y_1) = \sqrt{\theta_1^2 + \cdots + \theta_k^2}$ and $E(y_i) = 0$ ($i = 2, \cdots, k$). Since
$y_1^2 + \cdots + y_k^2 = \bar{x}_1^2 + \cdots + \bar{x}_k^2$ and since (6.11) depends only on $\bar{x}_1^2 + \cdots + \bar{x}_k^2$,
it is seen that the sequence of expression (6.11) formed for any sequence of
integers $n$ has a joint distribution which depends only on $\sqrt{\theta_1^2 + \cdots + \theta_k^2}$.
Hence $\beta(\theta)$ is constant on any sphere with center at the origin. Since (6.11)
is a strictly increasing function of $r_x$, it can be shown that $\beta(\theta)$ is a monotonically
decreasing function of $\sqrt{\theta_1^2 + \cdots + \theta_k^2}$. Hence, we can test the hypothesis
$H_0$ by the sequential process based on (6.10).

If $k = 1$, that is, if we test the mean value of a single normal variate, the
sphere $S(d)$ is a 0-dimensional sphere consisting of the two points $\theta_1 = +d$ and
$\theta_1 = -d$ and expression (6.10) reduces to

$$\frac{\tfrac{1}{2}\exp\left[-\tfrac{1}{2}\sum_{a}(x_a - d_0)^2\right] + \tfrac{1}{2}\exp\left[-\tfrac{1}{2}\sum_{a}(x_a + d_0)^2\right]}{\exp\left[-\tfrac{1}{2}\sum_{a} x_a^2\right]} = \tfrac{1}{2}\exp\left[-\tfrac{1}{2} n d_0^2\right]\left\{\exp[n\bar{x} d_0] + \exp[-n\bar{x} d_0]\right\}.$$
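For the case $k = 1$ the ratio is easy to evaluate numerically. The sketch below computes its logarithm; the function name `log_ratio_k1` is hypothetical, and the rewriting of $\log\cosh z$ as $|z| + \log(1 + e^{-2|z|}) - \log 2$ is a standard device for avoiding overflow, not something taken from the text.

```python
import math

def log_ratio_k1(xs, d0):
    """Logarithm of (1/2) exp(-n d0^2 / 2) {exp(n xbar d0) + exp(-n xbar d0)}."""
    n = len(xs)
    z = abs(sum(xs) * d0)  # |n * xbar * d0|
    log_cosh = z + math.log1p(math.exp(-2 * z)) - math.log(2)
    return -0.5 * n * d0**2 + log_cosh
```

The test then compares this value with $\log A$ and $\log B$ after each observation, rejecting $H_0$ when it is at least $\log A$ and accepting $H_0$ when it is at most $\log B$. The value depends on the observations only through $|\bar{x}|$, as the general argument requires.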

6.2. Sequential test of a composite hypothesis. We shall give only a brief
outline of the principles on which a sequential test of a composite hypothesis
can be based, since they are analogous to those for a simple hypothesis. Let
$X_1, \cdots, X_p$ be a set of $p$ random variables and let $f(x_1, \cdots, x_p, \theta_1, \cdots, \theta_k)$
be the joint probability density function of these variables involving $k$ unknown
parameters $\theta_1, \cdots, \theta_k$. Denote the set of all possible parameter points
$\theta = (\theta_1, \cdots, \theta_k)$ by $\Omega$. Suppose that we wish to test the hypothesis $H_0$ that the
true parameter point $\theta$ is contained in the subset $\omega$ of $\Omega$. Let $\bar{\omega}$ be the set of
all points of $\Omega$ which are not contained in $\omega$. Furthermore, let $w_0(\theta)$ and $w_1(\theta)$
be two non-negative functions of $\theta$, called weight functions, such that

$$\int_{\omega} w_0(\theta)\,d\theta = 1 \quad\text{and}\quad \int_{\bar{\omega}} w_1(\theta)\,d\theta = 1. \tag{6.16}$$

If $\omega$ is a surface in the space $\Omega$ then the integral over $\omega$ is meant to be the surface
integral over $\omega$.

In testing a composite hypothesis the probability of an error of the first kind
need not necessarily be the same for all points $\theta$ in $\omega$. It will, in general, be a
function $\alpha(\theta)$ of the true point $\theta$ in $\omega$. Similarly the probability of an error of
the second kind is a function $\beta(\theta)$ of $\theta$ defined for all points in $\bar{\omega}$. Suppose that
we wish to construct a sequential test such that the weighted average
$\int_{\omega} w_0(\theta)\alpha(\theta)\,d\theta$ of the probabilities of errors of the first kind is a given value
$\alpha$, and the weighted average $\int_{\bar{\omega}} w_1(\theta)\beta(\theta)\,d\theta$ of the probabilities of errors of
the second kind is a given value $\beta$. Then the following sequential test can be
used: Denote by $H_0^*$ the hypothesis that the probability density in the sample
space of $n$ observations on $X_1, \cdots, X_p$ is given by

$$p_{0n} = \int_{\omega} w_0(\theta)\left[\prod_{a=1}^{n} f(x_{1a}, \cdots, x_{pa}, \theta_1, \cdots, \theta_k)\right] d\theta \tag{6.17}$$

and by $H_1^*$ the hypothesis that the density in the sample space is given by

$$p_{1n} = \int_{\bar{\omega}} w_1(\theta)\left[\prod_{a=1}^{n} f(x_{1a}, \cdots, x_{pa}, \theta_1, \cdots, \theta_k)\right] d\theta. \tag{6.18}$$

The sequential probability ratio test for testing $H_0^*$ against the single alternative
$H_1^*$ provides a solution of our problem. If the constants $A$ and $B$ in this sequential
test are chosen so that the probability is $\alpha$ that we reject $H_0^*$ when $H_0^*$ is
true, and the probability is $\beta$ that we accept $H_0^*$ when $H_1^*$ is true, then for this
sequential test we have

$$\int_{\omega} w_0(\theta)\alpha(\theta)\,d\theta = \alpha$$

and

$$\int_{\bar{\omega}} w_1(\theta)\beta(\theta)\,d\theta = \beta.$$

This can be proved in the same way as the corresponding statement in the case
of a simple hypothesis.

Frequently we may require a sequential test procedure such that the least
upper bound of $\alpha(\theta)$ in $\omega$ is equal to a given $\alpha$ and $\beta(\theta)$ is less than or equal to a
given $\beta$ for all points $\theta$ whose "distance" (defined in some sense) from $\omega$ is greater
than or equal to a given positive value $d_0$. The "distance" of a parameter
point $\theta$ from $\omega$ may be defined by some function $\delta(\theta, \omega)$ which is positive if $\theta$
is not in $\omega$ and is zero if $\theta$ is in $\omega$. The distance function will be chosen in general
according to practical needs and mathematical convenience. For reasons similar
to those discussed in the case of a simple hypothesis, an appropriate sequential
test procedure with the desired properties can be found as follows: Let $\bar{\omega}(d)$
be the set of all points $\theta$ for which $\delta(\theta, \omega) \ge d$. Let, furthermore, $w_0(\theta)$ and
$w_1(\theta)$ be two weight functions such that

$$\int_{\omega} w_0(\theta)\,d\theta = \int_{\bar{\omega}(d_0)} w_1(\theta)\,d\theta = 1. \tag{6.19}$$

Denote by $H_0^*$ the hypothesis that the probability density in the sample space
of $n$ observations on $X_1, \cdots, X_p$ is given by

$$p_{0n} = \int_{\omega} w_0(\theta)\left[\prod_{a=1}^{n} f(x_{1a}, \cdots, x_{pa}, \theta)\right] d\theta \qquad (n = 1, 2, \cdots) \tag{6.20}$$

and by $H_1^*$ the hypothesis that the probability density in the sample space of $n$
observations on $X_1, \cdots, X_p$ is given by

$$p_{1n} = \int_{\bar{\omega}(d_0)} w_1(\theta)\left[\prod_{a=1}^{n} f(x_{1a}, \cdots, x_{pa}, \theta)\right] d\theta. \qquad (n = 1, 2, \cdots) \tag{6.21}$$

Consider the sequential probability ratio test for testing the simple hypothesis
$H_0^*$ against the single alternative $H_1^*$. For any $\theta$ in $\omega$ let $\alpha(\theta)$ be the probability
of accepting $H_1^*$ when $\theta$ is true, and for any $\theta$ in $\bar{\omega}$ let $\beta(\theta)$ be the probability
of accepting $H_0^*$ when $\theta$ is true. It is clear that $\alpha(\theta)$ and $\beta(\theta)$ depend on
the constants $A$ and $B$ used in the sequential process and on the weight functions
$w_0(\theta)$ and $w_1(\theta)$. For given $A$, $B$, $w_0(\theta)$ and $w_1(\theta)$ let $\beta[A, B, w_0(\theta), w_1(\theta)]$ be the
least upper bound of $\beta(\theta)$ in $\bar{\omega}(d_0)$ and let $\alpha[A, B, w_0(\theta), w_1(\theta)]$ be the least upper
bound of $\alpha(\theta)$ in $\omega$. Consider the differences

$$\Delta\alpha[A, B, w_0(\theta), w_1(\theta)] = \alpha[A, B, w_0(\theta), w_1(\theta)] - \int_{\omega} w_0(\theta)\alpha(\theta)\,d\theta$$

and

$$\Delta\beta[A, B, w_0(\theta), w_1(\theta)] = \beta[A, B, w_0(\theta), w_1(\theta)] - \int_{\bar{\omega}(d_0)} w_1(\theta)\beta(\theta)\,d\theta.$$

Determine $w_0(\theta)$ and $w_1(\theta)$ so that $\operatorname{Max}[\Delta\alpha, \Delta\beta]$ is a minimum. For these
weight functions the least upper bound of $\alpha(\theta)$ in $\omega$ and the least upper bound of
$\beta(\theta)$ in $\bar{\omega}(d_0)$ will be functions of $A$ and $B$ only. Finally, we determine $A$ and $B$
so that the least upper bound of $\alpha(\theta)$ in $\omega$ becomes $\alpha$, and the least upper bound
of $\beta(\theta)$ in $\bar{\omega}(d_0)$ becomes $\beta$.

The determination of $w_0(\theta)$ and $w_1(\theta)$ involves the solution of problems in
the calculus of variations. However, in some important cases the solution of
the problem can easily be derived, since weight functions $w_0(\theta)$ and $w_1(\theta)$ can be
found for which $\Delta\alpha = \Delta\beta = 0$. Such a situation is given, for instance, in the
following case: Let $S(d)$ be the set of all points $\theta$ for which $\delta(\theta, \omega) = d$. Suppose
that we can find two weight functions $v_0(\theta)$ and $v_1(\theta)$ such that
$\int_{\omega} v_0(\theta)\,d\theta = \int_{S(d_0)} v_1(\theta)\,dS = 1$ ($dS$ denotes the infinitesimal surface element of $S(d_0)$) and
the sequential probability ratio test based on

$$\frac{\displaystyle\int_{S(d_0)} v_1(\theta)\left[\prod_{a=1}^{n} f(x_{1a}, \cdots, x_{pa}, \theta)\right] dS}{\displaystyle\int_{\omega} v_0(\theta)\left[\prod_{a=1}^{n} f(x_{1a}, \cdots, x_{pa}, \theta)\right] d\theta}$$

has the following properties: (1) $\alpha(\theta)$ is constant in $\omega$; (2) $\beta(\theta)$ is constant on
$S(d)$ for any $d \ge d_0$; (3) $\beta(\theta)$ is strictly decreasing with increasing $d$ in the
domain $d \ge d_0$. Then for these weight functions we evidently have
$\Delta\alpha = \Delta\beta = 0$.

Let us illustrate this by a simple example. Let $X$ be a normally distributed
variate with unknown mean $\mu$ and unknown variance $\sigma^2$. Suppose that we
want to test the hypothesis $H_0$ that $\mu = 0$ and that the distance of the point
$(\mu, \sigma)$ from the set $\omega$ is defined by $|\mu/\sigma|$.

The set $S(d_0)$ then consists of all points $(\mu, \sigma)$ for which $\mu = +d_0\sigma$ or $\mu = -d_0\sigma$.
The set $\omega$ consists of all points $(0, \sigma)$ where $\sigma$ can take any arbitrary positive
value. Let $r$ be a positive value. We define the weight functions $v_{0r}(\sigma)$ and
$v_{1r}(\sigma)$ as follows: $v_{0r}(\sigma) = \frac{1}{r}$ if $0 \le \sigma \le r$ and equals zero for all other values of
$\sigma$. The weight function $v_{1r}(\sigma)$ is equal to $\frac{1}{2r}$ if $0 \le \sigma \le r$ and $\mu = \pm d_0\sigma$, and equal
to zero otherwise.
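Hence

$p_{1n} = \dfrac{1}{(2\pi)^{n/2}}\, \dfrac{1}{2r} \int_0^r \dfrac{1}{\sigma^n} \Big\{ \exp\Big[-\dfrac{1}{2}\dfrac{\sum_\alpha (x_\alpha - d_0\sigma)^2}{\sigma^2}\Big] + \exp\Big[-\dfrac{1}{2}\dfrac{\sum_\alpha (x_\alpha + d_0\sigma)^2}{\sigma^2}\Big] \Big\}\, d\sigma,$

$p_{0n} = \dfrac{1}{(2\pi)^{n/2}}\, \dfrac{1}{r} \int_0^r \dfrac{1}{\sigma^n} \exp\Big[-\dfrac{1}{2}\dfrac{\sum_\alpha x_\alpha^2}{\sigma^2}\Big]\, d\sigma,$

and

$\dfrac{p_{1n}}{p_{0n}} = \dfrac{\frac{1}{2}\int_0^r \frac{1}{\sigma^n} \Big\{ \exp\Big[-\frac{1}{2}\frac{\sum_\alpha (x_\alpha - d_0\sigma)^2}{\sigma^2}\Big] + \exp\Big[-\frac{1}{2}\frac{\sum_\alpha (x_\alpha + d_0\sigma)^2}{\sigma^2}\Big] \Big\}\, d\sigma}{\int_0^r \frac{1}{\sigma^n} \exp\Big[-\frac{1}{2}\frac{\sum_\alpha x_\alpha^2}{\sigma^2}\Big]\, d\sigma}.$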


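We consider the limiting case when $r \to \infty$. Then

(6.25)  $\dfrac{p_{1n}}{p_{0n}} = \dfrac{\frac{1}{2}\int_0^\infty \frac{1}{\sigma^n} \Big\{ \exp\Big[-\frac{1}{2}\frac{\sum_\alpha (x_\alpha - d_0\sigma)^2}{\sigma^2}\Big] + \exp\Big[-\frac{1}{2}\frac{\sum_\alpha (x_\alpha + d_0\sigma)^2}{\sigma^2}\Big] \Big\}\, d\sigma}{\int_0^\infty \frac{1}{\sigma^n} \exp\Big[-\frac{1}{2}\frac{\sum_\alpha x_\alpha^2}{\sigma^2}\Big]\, d\sigma}.$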


The sequential test based on the ratio (6.25) provides a solution of the problem
if it can be shown to have the following three properties: (1) $\alpha(\theta)$ is constant in
$\omega$; (2) $\beta(\theta)$ is only a function of $|\mu/\sigma|$; (3) $\beta(\theta)$ is monotonically decreasing with
increasing $|\mu/\sigma|$. Denote $\frac{1}{n}\sum_{\alpha=1}^{n} x_\alpha$ by $\bar{x}$ and $\sum_{\alpha=1}^{n}(x_\alpha - \bar{x})^2$ by $s^2$. Since the
distribution of $\bar{x}/s$ depends only on $\mu/\sigma$, the first two properties are proved if we
show that the ratio (6.25) is a single valued function of $|\bar{x}|/s$.


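First we show that the numerator of the ratio (6.25) is a homogeneous function
of $(x_1, \dots, x_n)$ of degree $-(n - 1)$. In fact, for any $\lambda > 0$, making the transformation
$\sigma = \lambda\tau$ we obtain

$\dfrac{1}{2}\int_0^\infty \dfrac{1}{\sigma^n} \Big\{ \exp\Big[-\dfrac{1}{2}\dfrac{\sum_\alpha (\lambda x_\alpha - d_0\sigma)^2}{\sigma^2}\Big] + \exp\Big[-\dfrac{1}{2}\dfrac{\sum_\alpha (\lambda x_\alpha + d_0\sigma)^2}{\sigma^2}\Big] \Big\}\, d\sigma$

$= \dfrac{1}{\lambda^{n-1}} \cdot \dfrac{1}{2}\int_0^\infty \dfrac{1}{\tau^n} \Big\{ \exp\Big[-\dfrac{1}{2}\dfrac{\sum_\alpha (x_\alpha - d_0\tau)^2}{\tau^2}\Big] + \exp\Big[-\dfrac{1}{2}\dfrac{\sum_\alpha (x_\alpha + d_0\tau)^2}{\tau^2}\Big] \Big\}\, d\tau.$

This proves that the numerator of (6.25) is a homogeneous function of degree
$-(n - 1)$. Similarly, it can be shown that the denominator of (6.25) is also a
homogeneous function of degree $-(n - 1)$. Thus the ratio (6.25) is a homogeneous
function of zero degree in the variables $x_1, \dots, x_n$.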
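
The zero-degree homogeneity can be checked numerically. The following is a minimal sketch rather than part of the original argument: it assumes the ratio (6.25) is one half of the sum of the two exponential integrals over $\sigma$ (for $\mu = \pm d_0\sigma$) divided by the corresponding integral for $\mu = 0$, substitutes $u = 1/\sigma$, and integrates with a composite Simpson rule. The names `t_ratio` and `_simpson`, the cutoff, and the grid size are illustrative assumptions.

```python
import math


def _simpson(f, a, b, m=4000):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for j in range(1, m):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3


def t_ratio(xs, d0):
    """Numerically evaluate the assumed form of the ratio (6.25); needs len(xs) >= 2."""
    n = len(xs)
    sq = sum(x * x for x in xs)   # sum of x_alpha^2
    sm = sum(xs)                  # sum of x_alpha

    # With u = 1/sigma, sigma^(-n) d(sigma) becomes u^(n-2) du and
    # sum (x_alpha u -/+ d0)^2 = u^2 * sq -/+ 2 * d0 * u * sm + n * d0^2.
    def expo(u, sign):
        return math.exp(-0.5 * (u * u * sq - 2 * sign * d0 * u * sm + n * d0 * d0))

    # Beyond this point both integrands are negligible (about 12 standard widths).
    upper = abs(d0 * sm) / sq + 12 / math.sqrt(sq)
    num = 0.5 * _simpson(lambda u: u ** (n - 2) * (expo(u, 1) + expo(u, -1)), 0.0, upper)
    den = _simpson(lambda u: u ** (n - 2) * math.exp(-0.5 * u * u * sq), 0.0, upper)
    return num / den


if __name__ == "__main__":
    sample = [0.5, 1.2, -0.3, 0.8]
    print(t_ratio(sample, 0.5), t_ratio([3 * x for x in sample], 0.5))
```

Rescaling every observation by the same positive factor, or reversing all signs, should leave the printed value unchanged up to rounding, which is the property the text uses to reduce the ratio to a function of $|\bar{x}|/s$.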

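It can be seen that (6.25) is a function of the two expressions $\sum x_\alpha^2$ and $\sum x_\alpha$
only; i.e.,

(6.26)  $\dfrac{p_{1n}}{p_{0n}} = \phi\Big(\sum x_\alpha^2,\ \sum x_\alpha\Big).$

Let $v = \sqrt{\sum x_\alpha^2}$. Since (6.26) is a homogeneous function of zero degree, its
value is not changed by substituting $x_\alpha / v$ for $x_\alpha$. Hence,

(6.27)  $\dfrac{p_{1n}}{p_{0n}} = \phi\Big(1,\ \dfrac{\sum x_\alpha}{v}\Big).$

Since $\phi(\sum x_\alpha^2, -\sum x_\alpha) = \phi(\sum x_\alpha^2, \sum x_\alpha)$, we see that $p_{1n}/p_{0n}$ is a single
valued function of $|\bar{x}|/v$. Since $|\bar{x}|/v$ is a single valued function of $|\bar{x}|/s$, we have
proved that $p_{1n}/p_{0n}$ is a single valued function of $|\bar{x}|/s$.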

In order to prove property (3) of the sequential test based on the ratio (6.25),
we have to show that (6.25) is a strictly increasing function of $|\bar{x}|/s$.
Since $|\bar{x}|/v$ is a strictly increasing function of $|\bar{x}|/s$, we have only to show that
(6.25) is a strictly increasing function of $|\bar{x}|/v$. The latter statement is obviously
proved if we show that (6.25) increases with increasing value $|\bar{x}|$ while keeping
$v$ fixed. For fixed value of $v$ the denominator of (6.25) is constant. Thus, we
have merely to show that the numerator of (6.25) increases with increasing
$|\bar{x}|$ while keeping $v$ fixed. This follows easily from the fact that, for fixed $v$ and
$\sigma$, the sum of the two exponentials in the integrand is proportional to
$e^{n d_0 \bar{x}/\sigma} + e^{-n d_0 \bar{x}/\sigma}$, which is a strictly increasing function of $|\bar{x}|$.
NON-PARAMETRIC ESTIMATION. I. VALIDATION
OF ORDER STATISTICS

By H. Scheffé and J. W. Tukey
Syracuse University and Princeton University

1. Summary. Previous work on non-parametric estimation has concerned
three problems: (i) confidence intervals for an unknown quantile, (ii) population
tolerance limits, (iii) confidence bands for an unknown cumulative distribution
function (cdf). For problem (iii) a solution has been available which is valid
for any cdf whatever, but for (i) and (ii) it has heretofore been assumed that the
population has a continuous probability density. This paper validates the
existing solutions of (i) and (ii) assuming only a continuous cdf. It then modifies
these solutions so that they are valid for any cdf whatever.

2. Introduction. There are three problems of non-parametric estimation
(we exclude point-estimation) for which fairly satisfactory solutions are available;
their present status was summarized in a recent paper [4]. The purpose of this
series of articles is to extend and complete the theory of non-parametric estimation
in directions of both theoretical and practical interest.

In this series we shall employ the following conventions of notation: We
distinguish between a random variable and an arbitrary point in the Euclidean
space containing its domain by using a capital Roman letter for the former and
the corresponding lower case Roman letter for the latter. Thus if $X$ is a (scalar)
random variable, and $x$ a real number or $\pm\infty$, we speak of the probability that
$X \le x$ and denote it by $Pr\{X \le x\}$. Roman capitals will also be used to denote
cumulative distribution functions¹ (cdf's): A monotone non-decreasing function
$F(x)$ will be called the cdf of $X$ if $F(x + 0) = Pr\{X \le x\}$. The definition of
$F(x)$ at its points of discontinuity will be immaterial. Again, $E = (X_1, \dots, X_n)$
will denote a random sample from a population with cdf $F(x)$, whereas
$e = (x_1, \dots, x_n)$ will denote a point in the sample space $R_n$. If $t$ is a function of $e$
only, $t = \varphi(e)$, then the random variable $T = \varphi(E)$ is a statistic. The order
statistics of the sample $E$ are defined to be $-\infty, Z_1, \dots, Z_n, +\infty$, where
$Z_1 \le Z_2 \le \dots \le Z_n$ is a rearrangement of $X_1, X_2, \dots, X_n$. We shall write
$Z_0 = -\infty$, $Z_{n+1} = +\infty$. The device of including $+\infty$ and $-\infty$ among the order
statistics will enable us to avoid special statements to cover the case of one-sided
estimation. Confidence coefficients will be denoted by $1 - \alpha$. Finally, it will
be convenient to symbolize² the following three classes of cdf's: $\Omega_0$ is the class of
all univariate cdf's $F$; $\Omega_2$, the class of all continuous $F$; $\Omega_4$, the class of all $F$ with
continuous derivative $F'(x)$.

¹ One of the authors wishes to point out the need of a clear, concise, and adequate term
for this basic and important concept.

² The notation follows [3].
We now list the three problems. In each case it is understood that the solution
sought is to be valid for all cdf's in some chosen class. The names³ associated
with the problems are (i) W. R. Thompson, K. R. Nair, (ii) Wilks, (iii)
Wald, Wolfowitz, Kolmogoroff.

(i) To find confidence intervals for an unknown quantile $q_p$, where $q_p$ is
defined by $F(q_p) = p$, $0 < p < 1$; in other words, to find statistics $T_1$, $T_2$ such
that⁴

(1.1)  $Pr\{T_1 < q_p < T_2 \mid F\} = 1 - \alpha.$

(ii) To find tolerance limits $T_1$, $T_2$ which, with confidence $1 - \alpha$, will cover a
proportion $b$ or more of the population, that is,

(1.2)  $Pr\{F(T_2) - F(T_1) \ge b \mid F\} = 1 - \alpha.$

(iii) To find a confidence band for an unknown cdf $F$, that is, a random region
$R(E)$ in the $x,y$-plane such that

(1.3)  $Pr\{R(E) \text{ covers } g \mid F\} = 1 - \alpha,$
where $g$ is the graph of $y = F(x)$.

The existing solutions of problem (iii) are known to be valid for $F$ in $\Omega_2$,
but those of problems (i) and (ii) have been validated only for $F$ in $\Omega_4$. The
extension to $F$ in $\Omega_2$ is an immediate consequence of the theorem in section 4;
this section also contains a discussion of some other implications of the theorem.
In section 5 the appropriate modifications of the solutions of problems (i) and
(ii) are found which extend their validity to the general case $F$ in $\Omega_0$. Whereas
Pitman ([1]; also [4], p. 310) has shown how non-parametric tests may be
extended to the possibly discontinuous case, the only solution of the three
estimation problems previously extended to this case is that of Kolmogoroff for problem
(iii). Extension from $\Omega_2$ to $\Omega_0$ is of considerable practical interest, not only
in the case of populations ordinarily considered discrete, but also as affecting
the problem of the finiteness of the number of significant figures in measurements
and the resulting occurrence of "ties" in ranked measurements. Before making
these extensions we discuss in the next section the transformations on which
they are based.

3. Two useful transformations of random variables. We shall reserve the
symbol $X^*$ for a random variable having a uniform distribution on the interval
from 0 to 1. Its cdf is

(1.4)  $U(x^*) = Pr\{X^* \le x^*\} = \begin{cases} 0 & \text{if } x^* \le 0, \\ x^* & \text{if } 0 \le x^* \le 1, \\ 1 & \text{if } x^* \ge 1. \end{cases}$

³ For bibliography see [4].

⁴ The notation $Pr\{R \mid F_0\}$ denotes the probability of the relation $R$ being true, calculated
under the assumption that the cdf of the population is $F_0(x)$.
The device of transforming from any random variable $X$ with $F$ in $\Omega_2$
to one with cdf $U$ was early used by Karl Pearson and more recently by many
others; it is known in the literature as the "probability integral transformation."
We define the transformation $x^* = h_F(x)$ as follows: For $-\infty < x < +\infty$,
$h_F(x) = F(x)$, $h_F(+\infty) = +\infty$, $h_F(-\infty) = -\infty$. If $F$ is in $\Omega_2$, the following
statements are evident for the transform $X^* = h_F(X)$: $X^*$ has $U(x^*)$ as its cdf.
With $X_i^* = h_F(X_i)$, a random sample $E = (X_1, \dots, X_n)$ from $F$ transforms
into a random sample $E^* = (X_1^*, \dots, X_n^*)$ from $U$. The order statistics
$\{Z_i\}$ of $E$ transform into the order statistics $\{Z_i^*\}$ of $E^*$ with $Z_i^* = h_F(Z_i)$,
$i = 0, 1, \dots, n + 1$.

It is easily seen that if $F$ is not in $\Omega_2$, the above transformation $Y = h_F(X)$
does not give $Y$ the cdf $U$; indeed, if $F$ is not in $\Omega_2$, the cdf of any single-valued
function $Y$ of $X$ is also not in $\Omega_2$, for there will be at least one point $x = x_0$ with
positive probability, and likewise for its transform $y_0$. Nevertheless our
arguments in section 4 depend on relating a random variable with arbitrary cdf $F$ in
$\Omega_0$ to the uniformly distributed $X^*$. While it is not possible to transform from
$X$ to $X^*$ without introducing a further random process, it is possible to transform
directly from $X^*$ to $X$. This suffices for our needs. We shall always denote
this transformation by $X = g_F(X^*)$. The following definition of the function
$x = g_F(x^*)$ makes it independent of the normalization of $F$ at its discontinuities:

(1.5)  $F(x - 0) \le U(x^*) \le F(x + 0).$

A sketched diagram may aid the reader in following the argument: To every
$x^*$ ($-\infty \le x^* \le +\infty$) there corresponds at least one $x$, and this $x$ is unique
unless it lies in an interval to which $F$ assigns zero probability. In the latter
case we shall assume that some $x$ in the interval is designated to be $g_F(x^*)$. It
will be seen that it is immaterial which $x$ is thus chosen. However if $x = -\infty$
or $+\infty$ is in an interval of constancy of $F$ we specify $g_F(-\infty) = -\infty$, $g_F(+\infty) = +\infty$.

To prove that $g_F(X^*)$ has the cdf $F(x)$ and thus can be identified with $X$, it
is sufficient to prove that $Pr\{g_F(X^*) \le x\} = F(x + 0)$. Now $g_F(X^*) \le x$ if and
only if $X^* \le x_+^*$, where

$x_+^* = \sup \{x^* : g_F(x^*) \le x\}.$

Hence $Pr\{g_F(X^*) \le x\} = Pr\{X^* \le x_+^*\} = U(x_+^*) = F(x + 0)$. It follows
that a random sample $E^*$ from $U$ transforms into a random sample $E$ from $F$.
The transformation preserves the relation $\le$; that is, if $x_a = g_F(x_a^*)$,
$x_b = g_F(x_b^*)$, then $x_a^* \le x_b^*$ implies $x_a \le x_b$. This means that the order statistics
$\{Z_i^*\}$ of $E^*$ transform into the order statistics $\{Z_i\}$ of $E$. We remark that
$x_a^* < x_b^*$ does not imply $x_a < x_b$; there is trouble when $x_b^* \le 0$ or $x_a^* \ge 1$, and
more serious trouble if $x_a^*$ and $x_b^*$ both go into the same discontinuity of $F$.
However, we shall need to utilize the fact that $x_a < x_b$ implies $x_a^* < x_b^*$.
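
For a purely discrete population the transformation $x = g_F(x^*)$ can be written out directly. The sketch below is illustrative only: it assumes a cdf with finitely many atoms, resolves the immaterial choice at boundary points by taking the smallest admissible atom, and uses the hypothetical name `make_g`.

```python
from bisect import bisect_left
from itertools import accumulate


def make_g(values, probs):
    """Return g_F for a discrete cdf with atoms `values` (ascending) and masses `probs`."""
    cum = list(accumulate(probs))  # F(x + 0) at each atom

    def g(u):
        # Smallest atom x with F(x + 0) >= u, so that F(x - 0) <= u <= F(x + 0) as in (1.5).
        idx = bisect_left(cum, u)
        return values[min(idx, len(values) - 1)]  # guard against rounding in the last sum

    return g


if __name__ == "__main__":
    g = make_g([0, 1, 2], [0.2, 0.5, 0.3])
    uniforms = [0.05, 0.30, 0.31, 0.69, 0.71, 0.99]
    print([g(u) for u in uniforms])
```

Because `g` is non-decreasing, sorted uniform values map to sorted sample values, which is the order-preserving property used in the text; distinct uniform values may nevertheless map to the same atom.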
4. Extension to continuous cdf's. A sufficient condition on $T_1$ and $T_2$ for a
solution (1.2) of problem (ii) to be valid for all $F$ in $\Omega_2$ is clearly that the joint
distribution of $F(T_1)$ and $F(T_2)$ be independent of $F$ in $\Omega_2$. If
$Pr\{F(T_i) = p \mid F\} = 0$ $(i = 1, 2)$, then (1.1) is equivalent to

(1.6)  $Pr\{F(T_1) < p < F(T_2) \mid F\} = 1 - \alpha,$

and so a sufficient condition that a solution (1.1) of problem (i) be valid for all
$F$ in $\Omega_2$ is again that the joint distribution of $F(T_1)$ and $F(T_2)$ be independent of
$F$ in $\Omega_2$. We are thus led to consider sufficient conditions on a set $T_1, T_2, \dots, T_r$
of statistics, which will insure that the joint distribution of $F(T_1), F(T_2),
\dots, F(T_r)$ be independent of $F$ in $\Omega_2$.

Theorem: A sufficient condition for the joint distribution of $F(T_1), F(T_2), \dots,
F(T_r)$ to be independent of $F$ in $\Omega_2$ is that the $\{T_j\}$ be a subset of the order statistics
$\{Z_i\}$ of the sample.

To prove the theorem it will suffice to show that the joint distribution of the
set of $n$ random variables $F(Z_1), F(Z_2), \dots, F(Z_n)$ is independent of $F$ in $\Omega_2$.
Let the cdf of the joint distribution be

(1.7)  $G_F(x_1, x_2, \dots, x_n) = Pr\{F(Z_1) \le x_1, \dots, F(Z_n) \le x_n \mid F\}.$

Employing the transformation $x^* = h_F(x)$ discussed in section 3, we see that the
above probability equals

(1.8)  $Pr\{Z_1^* \le x_1, \dots, Z_n^* \le x_n\},$

where $Z_0^*, Z_1^*, \dots, Z_{n+1}^*$ are the order statistics of a random sample $E^*$ from the
uniform cdf $U$. But this probability does not depend on $F$.

Since the existing solutions of problems (i) and (ii) are obtained by taking
$T_1$ and $T_2$ to be order statistics, we have validated these solutions for all $F$ in
$\Omega_2$. That the existing solutions of problem (iii) are valid for $F$ in $\Omega_2$ has been
demonstrated by their authors; this is however also an easy consequence of the
above theorem. The sufficiency condition expressed by this theorem together
with a necessity condition of Robbins [2] may indicate a natural path to the
formulation and solution of further problems of non-parametric estimation.
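
Since the confidence levels depend only on $n$ and the chosen ranks, they can be computed once and for all. The sketch below uses two standard facts that are not derived in this paper and are stated here as assumptions: for continuous $F$ the event $Z_k < q_p < Z_t$ occurs exactly when between $k$ and $t - 1$ observations fall below $q_p$, and the coverage $F(Z_t) - F(Z_k)$ follows a Beta distribution with parameters $t - k$ and $n - t + k + 1$. The function names are illustrative.

```python
from math import comb


def quantile_ci_confidence(n, k, t, p):
    """Pr{Z_k < q_p < Z_t} for n observations from a continuous cdf (0 <= k < t <= n + 1)."""
    # The number of observations below q_p is Binomial(n, p) and must lie in [k, t - 1].
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, min(t - 1, n) + 1))


def tolerance_confidence(n, k, t, b):
    """Pr{F(Z_t) - F(Z_k) >= b} for n observations from a continuous cdf."""
    m = t - k
    # Pr{Beta(m, n - m + 1) >= b} equals Pr{Binomial(n, b) <= m - 1}.
    return sum(comb(n, j) * b**j * (1 - b) ** (n - j) for j in range(m))


if __name__ == "__main__":
    print(quantile_ci_confidence(10, 2, 9, 0.5))   # median, interval (Z_2, Z_9)
    print(tolerance_confidence(10, 1, 10, 0.8))    # sample range as tolerance limits
```

With $k = 0$ or $t = n + 1$ the same functions cover the one-sided cases, in line with the convention $Z_0 = -\infty$, $Z_{n+1} = +\infty$.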

From a theoretical point of view it is of interest to note that even in those
pathological cases where no probability density function exists for the cdf $F$
in $\Omega_2$ ($F$ is non-absolutely continuous), the joint distribution (1.7) of $F(Z_1),
F(Z_2), \dots, F(Z_n)$ always possesses a density. That this density is $n!$ for
$0 \le F(Z_1) \le F(Z_2) \le \dots \le F(Z_n) \le 1$, and zero elsewhere, is evident if we consider
(1.8). By "integrating out" the other variables we are led to the following
practically useful result (it is well known for $F$ in $\Omega_4$): Choose any set $\{r_i\}$
of $s$ integers $(1 \le r_1 < r_2 < \dots < r_s \le n)$, and consider the joint distribution
of $F(Z_{r_1}), F(Z_{r_2}), \dots, F(Z_{r_s})$. This has a probability density function
$f(t_1, t_2, \dots, t_s)$, providing $F$ is in $\Omega_2$, given by the formula

(1.9)

for $0 \le t_1 \le t_2 \le \dots \le t_s \le 1$, and $f = 0$ elsewhere. As is conventional, the
result of applying $\prod_{i=1}^{0}$ is to be interpreted as unity, and the meaning of $f$ is
given by

$Pr\{F(Z_{r_i}) \le a_i\ (i = 1, 2, \dots, s) \mid F\} = \int_{-\infty}^{a_1} \int_{-\infty}^{a_2} \cdots \int_{-\infty}^{a_s} f(t_1, t_2, \dots, t_s)\, dt_s \cdots dt_2\, dt_1.$


5. Extension to discontinuous cdf's. Suppose we have a solution of problem
(i) based on order statistics and hence valid for $F$ in $\Omega_2$, say,

(1.10)  $Pr\{Z_k < q_p < Z_t \mid F\} = 1 - \alpha,$

where $0 \le k < t \le n + 1$. In particular this is valid for the uniform case,

(1.11)  $Pr\{Z_k^* < p < Z_t^*\} = 1 - \alpha.$

We now transform from the uniform cdf $U$ to an arbitrary $F$ in $\Omega_0$ by means of
the transformation $x = g_F(x^*)$ described in section 3. Suppose $q_p$ is defined
by $q_p = g_F(p)$. This means the quantile $q_p$ of the distribution with cdf $F$ is
determined from the relation

$F(q_p - 0) \le p \le F(q_p + 0),$

which assigns to the quantile its usual meaning if $F(x)$ is continuous and
non-constant at $x = q_p$, and a sensible definition if $F$ is discontinuous or constant
at $q_p$. From the discussion in section 3 we have

$\{Z_k < q_p < Z_t\}$ implies $\{Z_k^* < p < Z_t^*\}$ implies $\{Z_k \le q_p \le Z_t\}$,

and hence the probability relations

$Pr\{Z_k < q_p < Z_t \mid F\} \le Pr\{Z_k^* < p < Z_t^*\} \le Pr\{Z_k \le q_p \le Z_t \mid F\}.$

Substituting (1.11), we have

(1.12)  $Pr\{Z_k < q_p < Z_t \mid F\} \le 1 - \alpha \le Pr\{Z_k \le q_p \le Z_t \mid F\}.$

The statistical interpretation of (1.12) is the following: Consider any solution
(1.10) of problem (i), giving a confidence interval for the quantile $q_p$, valid for $F$
in $\Omega_2$. Then with the same values of $n$, $k$, $t$, and $\alpha$, the probability of the random
interval from $Z_k$ to $Z_t$ covering the unknown quantile $q_p$ is $\le 1 - \alpha$ for the open
interval, $\ge 1 - \alpha$ for the closed interval, no matter what the unknown cdf $F$.
If $F$ is continuous, the two probabilities are of course equal.
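
The two-sided bound on the coverage can be verified exactly when the quantile sits on an atom of $F$. The sketch below is an illustration under stated assumptions, not the authors' computation: it takes $F(q_p - 0)$ and the jump of $F$ at $q_p$ as inputs (named `below` and `jump` here), enumerates how many of the $n$ observations fall below, on, and above $q_p$, and compares the open and closed coverage with the nominal level computed from the uniform case.

```python
from math import comb


def nominal_confidence(n, k, t, p):
    """Pr{Z*_k < p < Z*_t} for a uniform sample: the number of X* below p lies in [k, t - 1]."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, min(t - 1, n) + 1))


def interval_coverage(n, k, t, below, jump):
    """Exact (open, closed) coverage of q when F(q - 0) = below and F jumps by `jump` at q."""
    above = 1 - below - jump
    open_p = closed_p = 0.0
    for lt in range(n + 1):              # observations strictly below q
        for eq in range(n - lt + 1):     # observations equal to q
            gt = n - lt - eq
            prob = comb(n, lt) * comb(n - lt, eq) * below**lt * jump**eq * above**gt
            if lt >= k and lt + eq <= t - 1:   # Z_k < q and q < Z_t
                open_p += prob
            if lt + eq >= k and lt <= t - 1:   # Z_k <= q and q <= Z_t
                closed_p += prob
    return open_p, closed_p


if __name__ == "__main__":
    level = nominal_confidence(10, 2, 9, 0.5)
    print(interval_coverage(10, 2, 9, below=0.3, jump=0.4), level)
```

Any $p$ with `below <= p <= below + jump` has the same quantile, so the nominal level may be evaluated at any such $p$; with `jump = 0` the open and closed coverages coincide with it.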
To extend the solution of problem (ii) to the general case $F$ in $\Omega_0$, suppose we
have a solution (1.2) using order statistics, say $T_1 = Z_k$, $T_2 = Z_t$
$(0 \le k < t \le n + 1)$. Such a solution will be valid for all $F$ in $\Omega_2$, in particular for $F = U$,

$Pr\{U(Z_t^*) - U(Z_k^*) \ge b\} = 1 - \alpha.$

Given now any arbitrary distribution $F$, we again use the transformation
$x = g_F(x^*)$. From (1.5),

$F(Z_i - 0) \le U(Z_i^*) \le F(Z_i + 0) \qquad (i = k, t).$

Hence

$B_- \le B_* \le B_+,$

where

$B_- = F(Z_t - 0) - F(Z_k + 0),$

$B_* = U(Z_t^*) - U(Z_k^*),$

$B_+ = F(Z_t + 0) - F(Z_k - 0).$

The implications

$\{B_- \ge b\}$ implies $\{B_* \ge b\}$ implies $\{B_+ \ge b\}$
yield the relations

$Pr\{B_- \ge b\} \le Pr\{B_* \ge b\} \le Pr\{B_+ \ge b\}.$

These may be written

(1.13)  $Pr\{F(Z_t - 0) - F(Z_k + 0) \ge b \mid F\} \le 1 - \alpha \le Pr\{F(Z_t + 0) - F(Z_k - 0) \ge b \mid F\}.$

To interpret (1.13), let us say that a Borel set $S$ covers a proportion $\pi$ of a
population with cdf $F(x)$ if $\int_S dF(x) = \pi$. If $S$ is an interval from $x'$ to $x''$,
then the proportion covered by $S$ is $F(x'' + 0) - F(x' - 0)$ if $S$ is closed, and
$F(x'' - 0) - F(x' + 0)$ if $S$ is open. The proportion covered by a point $x_0$
is the jump $F(x_0 + 0) - F(x_0 - 0)$ of the cdf $F$ at $x_0$. The statistical meaning
of (1.13) is now clear: For the random interval from $Z_k$ to $Z_t$, the probability
that the open interval cover a proportion $\ge b$ of the population is $\le 1 - \alpha$, the
probability that the closed interval cover a proportion $\ge b$ of the population is
$\ge 1 - \alpha$, regardless of the population. Again, for a continuous $F$ the two
probabilities are equal.
ON A TEST FOR RANDOMNESS BASED ON SIGNS OF DIFFERENCES¹

By Henry B. Mann
Ohio State University

1. Introduction. It has been pointed out by J. Wolfowitz [1] that we cannot
expect a test for randomness to be most powerful with respect to every possible
alternative. It is therefore necessary to find tests designed to distinguish a
random sample of observations from the same population from a sample coming
from some particular class $\Omega$ of distributions. Such a test need be consistent
in the sense of Wald and Wolfowitz [2] only with respect to alternatives in the
class $\Omega$.

Let $x_1, \dots, x_n$ be the measurable quality characteristics of $n$ units of a
manufactured article. We shall assume that the distribution of $x_i$ is continuous.
According to Shewhart the production process is termed "under statistical
control" if $x_1, \dots, x_n$ can be regarded as a random sample of $n$ independent
items each coming from the same population with known or unknown
distribution function.

In a random sample $P_i = P(x_i > x_{i+1}) = \frac{1}{2}$, where $P(E)$ denotes the
probability that $E$ will hold. The class $\Omega$ of alternatives which we shall consider is
described as follows. The cumulative distribution of $x_i$ is $f_i$ and the $f_i$,
$i = 1, 2, \dots$, are such that

$P_i = \frac{1}{2} + \epsilon_i, \qquad \sum_{i=1}^{n-1} \epsilon_i = \lambda_n (n - 1), \qquad \liminf_{n \to \infty} \lambda_n = \lambda > 0.$

t—» n-*oo 

Such a situation may, for instance, obtain if the production process is under
statistical control except for occasionally but not too infrequently occurring
periods during which the quality of the product decreases, after which decrease
statistical control is immediately restored. If the decreases in quality are sharp
enough or the periods of decrease long enough, then the alternative will belong
to the class $\Omega$ described before.

To give a practical example, consider a drill, which after some period of use will
wear off so that the quality of the manufactured article will decrease until the
drill is exchanged. After replacement of the drill by a new one, statistical
control is immediately restored. Now, if the drill is not replaced in time, the
periods of decrease in quality will be long and the rate of decrease will become
rapid so that the sequence of distribution functions will satisfy the conditions
of the class $\Omega$. A similar situation occurs also in time studies. For instance,
in the foregoing example, the time necessary for drilling one hole will tend to
increase when the drill is too long in use.

The following test first proposed by Moore and Wallis [3] for the study of
economic time series seems appropriate for our purpose: Let $x_1, \dots, x_n$ be the
sample and form the sequence $x_2 - x_1, \dots, x_n - x_{n-1}$. Let $S$ be the number
of negative differences in this sequence. Clearly, the distribution of $S$ is
independent of the distribution of $x_i$ provided the sample is an independent random
sample from a continuous distribution. Under one of the alternatives of the
class $\Omega$, $S$ will in a sample of $n$ tend to be larger than in a random sample if $\lambda_n > 0$.
Hence $S$ may be used as a statistic to distinguish between randomness and any
of the alternatives of the class $\Omega$. The distribution of $S$ was tabulated by
Moore and Wallis [3] for $n \le 12$. They also found empirically that $S$ approaches
a normal distribution. The asymptotic normality of the distribution of $S$
can be proved rigorously in a way analogous to the proof of Theorem 1 of a
paper by Wolfowitz [4]. The first four moments of $S$ were obtained by Moore
and Wallis, the fourth moment, however, only by empirical methods. In
this paper we shall derive a formula which makes it possible to compute the
moments of $S$ recursively. With the help of this formula we shall indicate an
alternative proof of the asymptotic normality of $S$ using the method of moments.
Finally, we shall derive a lower bound for the power of the $S$ test with respect
to alternatives in $\Omega$ valid for large $n$ and depending only on $\lambda_n$.

¹ Research under a grant of the Research Foundation of the Ohio State University.


2. The moments of S. Let $P_n(S)$ be the number of permutations in $n$ variables
with $S$ negative differences. MacMahon [5] has shown that

(1)  $P_n(S) = (S + 1)P_{n-1}(S) + (n - S)P_{n-1}(S - 1).$


Using (1) Moore and Wallis [3] have tabulated $P\big(\big|S - \frac{n-1}{2}\big| \ge \big|S' - \frac{n-1}{2}\big|\big)$.
In using their table for our purpose, one has to keep in mind that we are using
a one tail region; therefore $P(S \ge S')$ is for $S' > \frac{n-1}{2}$ one half of the value
tabulated by Moore and Wallis.

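Clearly the first moment of $S$ is $\frac{n-1}{2}$, since the expected value of $-$ signs
equals the expected value of $+$ signs. To find higher moments we multiply (1)
by $\big(S - \frac{n-1}{2}\big)^i$, divide by $n!$ and sum over $S$. Then we obtain

(2)  $E_n\Big[\Big(S - \frac{n-1}{2}\Big)^i\Big] = \frac{1}{n} E_{n-1}\Big[(S + 1)\Big(S - \frac{n-1}{2}\Big)^i + (n - S - 1)\Big(S - \frac{n-1}{2} + 1\Big)^i\Big],$

where $E_n[f(S)]$ denotes the expectation of $f(S)$ in permutations of $n$ variables.
From (2) we have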
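Putting $S - E(S) = x$ we obtain

(3)  $E_n(x^i) = \frac{1}{n} E_{n-1}\big[x(x - \tfrac{1}{2})^i - x(x + \tfrac{1}{2})^i\big] + \frac{1}{2} E_{n-1}\big[(x + \tfrac{1}{2})^i + (x - \tfrac{1}{2})^i\big].$

From the symmetry of the distribution as well as from (3) it may be seen that
all odd moments are 0 and therefore

$\frac{1}{2} E\big[(x + \tfrac{1}{2})^{2i} + (x - \tfrac{1}{2})^{2i}\big] = E(x + \tfrac{1}{2})^{2i},$
$E\big[x(x - \tfrac{1}{2})^{2i} - x(x + \tfrac{1}{2})^{2i}\big] = -2E\big[(x + \tfrac{1}{2})^{2i+1}\big] + E(x + \tfrac{1}{2})^{2i}.$
Hence we obtain from (2)

(4)  $E_n(x^{2i+1}) = 0, \qquad i = 0, 1, \dots$

$E_n(x^{2i}) = \frac{n+1}{n} E_{n-1}\big[(x + \tfrac{1}{2})^{2i}\big] - \frac{2}{n} E_{n-1}\big[(x + \tfrac{1}{2})^{2i+1}\big].$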
n n 


If all moments below the $2i$th moment are known, (4) becomes a difference
equation whose solution yields the $2i$th moment for $n \ge 2i$. Thus one obtains

$\sigma_n^2(S) = E_n(x^2) = \dfrac{n+1}{12}, \qquad E_n(x^4) = \dfrac{5(n+1)^2 - 2(n+1)}{240},$

$E_n(x^6) = \dfrac{35(n+1)^3 - 42(n+1)^2 + 16(n+1)}{4032}.$
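
These closed forms can be checked against the recursion (1). The following sketch builds the counts $P_n(S)$ from MacMahon's recursion with exact rational arithmetic and computes central moments of $S$; the starting value $P_1(0) = 1$ and the function names are assumptions made for the illustration.

```python
from fractions import Fraction
from math import factorial


def descent_counts(n):
    """P_n(S) for S = 0, ..., n - 1 via recursion (1), starting from P_1(0) = 1."""
    counts = [1]
    for m in range(2, n + 1):
        prev = counts + [0]  # P_{m-1}(m-1) = 0
        counts = [
            (s + 1) * prev[s] + (m - s) * (prev[s - 1] if s > 0 else 0)
            for s in range(m)
        ]
    return counts


def central_moment(n, i):
    """E_n[(S - (n - 1)/2)^i] under randomness, as an exact fraction."""
    mean = Fraction(n - 1, 2)
    counts = descent_counts(n)
    total = sum(c * (s - mean) ** i for s, c in enumerate(counts))
    return total / factorial(n)


if __name__ == "__main__":
    for n in (4, 6, 8):
        print(n, central_moment(n, 2), central_moment(n, 4), central_moment(n, 6))
```

Exact fractions are used so that the comparison with the polynomial expressions in $n + 1$ is not blurred by rounding.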


It is not difficult to prove from (4) by induction that

$\lim_{n \to \infty} \dfrac{E_n(x^{2i})}{\sigma_n^{2i}(S)} = (2i - 1)(2i - 3) \cdots 3 \cdot 1.$

To do this one proves first by induction that $E_n(x^{2i})$ is for $n \ge 2i$ a
polynomial in $n$ of degree $i$. It can then be proved by induction that the first
coefficient of this polynomial is $(2i - 1)(2i - 3) \cdots 3 \cdot 1 / 12^i$, from which the
assertion follows. Since $(2i - 1) \cdots 3 \cdot 1$ are the moments of a normal
distribution with variance 1 it follows that $\dfrac{\big(S - \frac{n-1}{2}\big)\sqrt{12}}{\sqrt{n+1}}$ is in the limit normally
distributed with mean 0 and variance 1. This result follows, however, also
easily from Theorem 2 of a paper by Wolfowitz [4].

It is also possible to show by induction from equation (4) that for $n \ge 2i$ the
$2i$th moment of $S$ is smaller than the corresponding moment of a normal
distribution with variance $\frac{n+1}{12}$.


3. The power of the S test. Let us assume now that one of the alternatives
of the class $\Omega$ is true. This is to say $P_i = P(x_i > x_{i+1}) = \frac{1}{2} + \epsilon_i$,
$\sum_i \epsilon_i = \lambda_n(n - 1)$, $\liminf \lambda_n = \lambda > 0$. Let

$z_i = 1$ if the $i$th sign is $-$,

$z_i = 0$ if the $i$th sign is $+$.

We shall show that 


We have 


= 1 1 - 1) ^ P(ZM = 1). 


r rdMxs)] f df^ix,) < rdj.ix,) f df2ix2) rdMx,) 

I JL.QQ j—ao J *'Xi J—96 *'Xl *^00 

< / dfiiXi) r f dftiXi) f dftix»). 

Adding f dfiix^) f f df2{x2) f dfzixs) to both sides of this inequality we 
A-bo L^—00 J—QO J 


have 


r dMx,) r dMx,) < r ds^) r dux,) r dMx»). 

4-00 4-00 4-00 L4-.B0 4—00 

Integrating both sides with respect to Xi, we obtain 

f d/i(xi) f dMxi) f dfaixi) 

4-00 4-O0 4-00 

< dfi(xi) d/aCxs)j 


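or

$P(z_1 = 1 \text{ and } z_2 = 1) \le P(z_1 = 1) \cdot P(z_2 = 1).$

From this it follows that $\sigma_{z_i z_{i+1}} \le 0$. Since $\sigma_{z_i}^2 = \frac{1}{4} - \epsilon_i^2$ we have
$\sigma_S^2 \le \sum_i \big(\frac{1}{4} - \epsilon_i^2\big) \le \frac{n-1}{4}(1 - 4\lambda_n^2)$. Moreover $E(S) = \frac{n-1}{2} + \lambda_n(n - 1)$.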

Let $\lambda' = \lambda$ if $\lambda < \frac{1}{2}$ and $0 < \lambda' < \lambda$ if $\lambda = \frac{1}{2}$. The critical region is for
sufficiently large $n$ given approximately by $S \ge \frac{n-1}{2} + t\sqrt{\frac{n+1}{12}}$, where $t$
depends on the level of significance $\alpha$ and must be chosen so that
$\frac{1}{\sqrt{2\pi}} \int_t^\infty e^{-x^2/2}\, dx = \alpha$. Hence, if we can show that under any alternative $H$ of the class
$\Omega$ and for any $\epsilon > 0$
(6) P(S > EiS) - i V(n - 1)(1 - 4\'*)) ^ ; 

for every t > I > 0, n > Nie, H, T), then we shall be able to give a lower bound 
for the power of the S te st. Th e power of the S test is approximately ^ven 

From (5) we have 


f tVn + 1 - 2\„(n.— 1) Vs 
VS(n - 1)(1 - 4X'») 


■*■** dx — t 


\/n+l~2Xn(n~l)\/8 

V(8n^8T(l^^*) 


< —? < 0, n > N{€f H, Z). 


The author considers it safe to assume that (6) holds with a fairly small $\epsilon$ for
$n \ge 12$ if $\lambda'$ in (6) is replaced by $\lambda'_n$, where $\lambda'_n = \lambda_n$ if $\lambda_n < \frac{1}{2}$ and $\lambda'_n < \frac{1}{2}$ if $\lambda_n = \frac{1}{2}$,
and if $\lambda_n$ is not too close to $\frac{1}{2}$. He bases this belief on the rapidity with which
the distribution of $S$ approaches normality under the null hypothesis of
randomness, and on the fact that at least under the null hypothesis the moments of $S$ are
smaller than the corresponding moments of a normal distribution. It may also
be seen from the following derivation of (6) that in many cases the power of the $S$
test will be considerably above the lower bound given in (6).

To prove (5), we need the following two lemmas 

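Lemma 1. Let $P(x \le t) = f(t)$. Let further $E(z) = 0$, $E(z^2) = c$. Then for
every $\delta > 0$

(7)  $f(t + \delta) + \dfrac{c}{\delta^2} \ge P(x + z \le t) \ge f(t - \delta) - \dfrac{c}{\delta^2}.$

Proof: Applying Tschebycheff's inequality we have

$P(x + z \le t) \le P(x \le t + \delta) + P(x > t + \delta \text{ and } z \le -\delta)$

$\le P(x \le t + \delta) + P(z \le -\delta) \le f(t + \delta) + \dfrac{c}{\delta^2},$

$P(x + z \le t) \ge P(x \le t - \delta \text{ and } z \le \delta)$

$\ge P(x \le t - \delta) - P(z > \delta) \ge f(t - \delta) - \dfrac{c}{\delta^2}.$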
Lemma 2. Let $x_i$, $i = 1, 2, \dots$, be a sequence of independent random variables
with mean 0, bounded $k$th absolute moment, $k > 2$, and variance $\sigma_i^2$. Let $M > 0$

ll.f2 ^ -T - ^ ®l+-'-+iCn 


and lim sup ^ M , Form the sequence of random variables Pn ’ 
n -*80 n 

then for any € > 0 and any t > I > 0 


M y/n 


( 8 ) 


P(y« < - 0 < 


y/^ JLoo 


dx + € for n > N(e, ?). 


Proof. Form a sequence m* with lim m* *= 0. Let 2/» =* «* + 


where 





denotes summation over all i for which a] > ma and all sums extend from 
one to n. 

Let f^ be the distribution of Xn then by Lemma 1 


r»(-« +«) + ^ > ^ -0 > /»(-«-«) - • 


Now we distinguish two cases. 

1st Case. The number of integers $i$ with $\sigma_i^2 \ge m_a$ is for some $a$ of order $n$.
In this case $\{f_n\}$ differs arbitrarily little from a sequence of normal distributions
with mean 0 and the upper limit of the variances at most 1.

2nd Case. The number of integers $i$ with $\sigma_i^2 \ge m_a$ is for every $a$ of smaller
order than $n$. In this case $x_n$ converges stochastically to 0. In both cases
(8) holds true since $m_a$ can be chosen arbitrarily small.

We can now prove (5). It follows easily from Tschebycheff's theorem that
(5) is true if $\lambda = \frac{1}{2}$. Hence we may assume $\lambda < \frac{1}{2}$. Let $z_i$ be defined as at
the beginning of this section. Form

^ 2 ( 2 , -- Ejzd) 

o-4f*+i\/(« - 1)(1 - 4X*)’ 

4 _ 2(z,t - Ejz,-,)) _ 2{zi - E(z i)) _ 

V(n - 1)(1 - 4X*) ’ V(n - 1)(1 - 4X») 

where m' == gffc is the largest integer multiple of k which does not exceed (n — 1). 
We form further 


Xn «= X) yy , 2« ~ Z Uf . 

;-l i-l 

Since a% < ^ ^ k{l ^ 4)?) from Lemma 1 that 

the distribution of $\dfrac{2(S - E(S))}{\sqrt{(n-1)(1 - 4\lambda^2)}}$ differs arbitrarily little from the
distribution of $x_n$ for sufficiently large $n$ and $k$. The second and the third absolute
moment of $\sqrt{n-1}\, y_j$ are bounded. Hence $\sqrt{n-1}\, y_j$ fulfills the
conditions of Lemma 2. The application of Lemma 2 yields (5) and
consequently (6).

The integer N{(, H, 1) is independent of < provided the lower limit .of the 
integral does not exceed —1. Hence we have proved 
Theorem. Letk, h, • • • he a ny sequence of numbers satisfying Vie condiHon 

_ J ^ tn "s/W 1 2(w l)Xfl \/3 ^ J ^ 

V(3» - 3)(1 - 4X'*) 

where X' = lim inf X« i/ lim inf X„ < j and 0 < X' < J otherwise. Let Pn{S, H) 

n-*oe n-*oo 

be the power of t he S tes t with reject to the aUemative H and critical region S > 
~2 


(9) 


liminf[p,(^.^)/^/%-*-dx]>l. 


It is worthwhile to remark that (9) is sharp. That is to say there exist
alternatives for which the left side of (9) is equal to 1. This is obviously the case
for any alternative with $P(x_i > x_{i+1}) = \frac{1}{2} + \lambda$ and
$P(z_i = 1 \text{ and } z_{i+1} = 1) = P(z_i = 1) \cdot P(z_{i+1} = 1)$. These conditions are, for instance, fulfilled by the
alternative given by $P(x_{i+1} = a - \delta - \dots - \delta^i) = \frac{1}{2} + \lambda$,
$P(x_{i+1} = c + \delta + \dots + \delta^i) = \frac{1}{2} - \lambda$, $i = 1, 2, \dots$, where $(a - c) > \frac{2\delta}{1 - \delta} > 0$.

If $t_n = t$ for every $n$ then (9) implies the consistency of the test if the order
of $\lambda_n$ is larger than $1/\sqrt{n}$. It may also be seen that the test is not consistent
with respect to alternatives for which $\lambda_n$ is of order at most equal to $1/\sqrt{n}$.
This remark refers of course only to alternatives for which $x_i$ is independent of
$x_j$ for $i \ne j$.
THE ASYMPTOTIC DISTRIBUTION OF RUNS OF CONSECUTIVE
ELEMENTS

By Irving Kaplansky
New York City

In a permutation of $1, 2, \dots, n$ let $r$ denote the number of instances in which
$i$ is next to $i + 1$, i.e., in which either of the successions $(i, i + 1)$ or $(i + 1, i)$
occurs. Thus for the permutation 234651, $r = 3$. In [3] Wolfowitz¹ has
proposed the use of $r$ for significance tests in the non-parametric case, and in [4]
he has shown that asymptotically $r$ has the Poisson distribution with mean
value 2. It is to be noted that $W(R)$, the number of runs as defined by
Wolfowitz, is equal to $n - r$.

In this note we shall derive more explicit results concerning the asymptotic
distribution of $r$. In a random permutation (all permutations being regarded
as equally probable) let the probability of exactly $r$ successions as above be
$P(n, r)$, and let $M(n, k)$ denote the $k$-th factorial moment of the distribution,
that is

$M(n, k) = \sum_r r(r - 1) \cdots (r - k + 1) P(n, r).$


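We shall show that

$$M(n, k) = 2^k\left[1 - \frac{k+1}{2k}\binom{k}{1}\frac{k}{n} + \frac{k+2}{2^2 k}\binom{k}{2}\frac{k(k-1)}{n(n-1)} - \cdots\right], \tag{1}$$

$$P(n, r) = \frac{2^r e^{-2}}{r!}\left[1 - \frac{r^2 - 3r}{2n} + \frac{r^4 - 8r^3 + 9r^2 + 22r - 16}{8n(n-1)}\right] + O(n^{-3}). \tag{2}$$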


Since $2^k$ is the k-th factorial moment of the Poisson distribution with mean 2,
either of these results serves to verify the asymptotic Poisson character of the
distribution of r.

It would be possible to obtain some kind of explicit formula for the general 
term of (2), but there seems to be no reasonably simple form. 

Proof of (1). Let $A_i$ denote the event "i + 1 comes right after i" and $B_i$
the event "i comes right after i + 1" (i = 1, ..., n − 1). The joint probability
of k of these 2n − 2 events is either 0, if they are incompatible,
or (n − k)!/n! if they are compatible, for in the latter case we in effect assign
positions for k of the elements and are then free to permute the n − k others.
Let f(n, k) denote the number of ways of selecting k compatible events. Then
it is known that ([1], eq. (40))

$$M(n, k) = k!\,f(n, k)\,(n - k)!/n! = f(n, k)\Big/\binom{n}{k}. \tag{3}$$



¹ I am indebted to Dr. Wolfowitz for calling my attention to this problem, and to its
identity with what I called the "n-kings problem" in [2].

The relations of incompatibility can be summarized by the statement that
$A_i$ is incompatible with $B_j$ if $|i - j| \le 1$. In view of (3), our task thus reduces
to the proof of the following combinatorial lemma.

Lemma. Suppose 2n − 2 objects $A_1, \ldots, A_{n-1}$; $B_1, \ldots, B_{n-1}$ are given.
Let f(n, k) denote the number of ways of selecting k objects with the restriction that
$A_i$ and $B_j$ must not both be chosen when $|i - j| \le 1$. Then

$$f(n, k) = 2^k \sum_{i=0}^{k} (-1)^i\,\frac{k+i}{2^i k}\binom{k}{i}\binom{n-i}{k-i}. \tag{4}$$


Proof. We split the acceptable selections into two subsets: those which
include $A_{n-1}$ and those which do not. Let the latter be g(n, k) in number.
Since the selections which include $A_{n-1}$ must omit $B_{n-1}$ and $B_{n-2}$, it is clear that
they are g(n − 1, k − 1) in number. Thus

(5) f(n, k) = g(n, k) + g(n − 1, k − 1).

Similarly we split the selections which omit $A_{n-1}$ according as they omit or
include $B_{n-1}$; we obtain

(6) g(n, k) = f(n − 1, k) + g(n − 1, k − 1).

Elimination of g from (5) and (6) yields²

(7) f(n, k) = f(n − 1, k) + f(n − 1, k − 1) + f(n − 2, k − 1).

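We can now make an inductive proof of (4). Assuming (4), we have

$$\frac{f(n, k) - f(n-1, k)}{2^k} = \sum_i (-1)^i\,\frac{k+i}{2^i k}\binom{k}{i}\binom{n-i-1}{k-i-1},$$

$$\frac{f(n-1, k-1) + f(n-2, k-1)}{2^k} = \sum_i (-1)^i\,\frac{k+i-1}{2^{i+1}(k-1)}\binom{k-1}{i}\left[\binom{n-i-1}{k-i-1} + \binom{n-i-2}{k-i-1}\right]$$

$$= \sum_i (-1)^i\binom{n-i-1}{k-i-1}\left[\frac{k+i-1}{2^i(k-1)}\binom{k-1}{i} + \frac{k+i-2}{2^i(k-1)}\binom{k-1}{i-1}\right].$$

In view of the identity

$$\frac{k+i}{k}\binom{k}{i} = \frac{k+i-1}{k-1}\binom{k-1}{i} + \frac{k+i-2}{k-1}\binom{k-1}{i-1}$$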

we now readily verify that the right hand side of (4) satisfies (7). To complete 
the induction we must check the appropriate boundary conditions. According 
to (4) we have 




= 0, 


f(n, 1) = 2n − 2, both as they should be.


² This recursion formula is essentially the same as equation (20) in [2].
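The lemma lends itself to a numerical check for small n. The sketch below evaluates the closed form (4) with exact rational arithmetic and compares it with a brute-force count and with recursion (7); the function names `f_closed` and `f_brute` are hypothetical, and the upper summation limit i = k is an assumption (terms beyond it vanish).

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def f_closed(n, k):
    """Right-hand side of (4), for k >= 1, in exact arithmetic."""
    total = sum(
        Fraction((-1) ** i * (k + i), 2 ** i * k) * comb(k, i) * comb(n - i, k - i)
        for i in range(k + 1)
    )
    return 2 ** k * total


def f_brute(n, k):
    """Count k-subsets of A_1..A_{n-1}, B_1..B_{n-1} in which no A_i is
    chosen together with a B_j having |i - j| <= 1."""
    objects = [(kind, i) for kind in "AB" for i in range(1, n)]

    def compatible(selection):
        return all(
            p[0] == q[0] or abs(p[1] - q[1]) > 1
            for p, q in combinations(selection, 2)
        )

    return sum(1 for s in combinations(objects, k) if compatible(s))


for n in range(3, 7):
    for k in range(1, n):
        assert f_closed(n, k) == f_brute(n, k)
        if k >= 2:
            # Recursion (7).
            assert f_closed(n, k) == (
                f_closed(n - 1, k) + f_closed(n - 1, k - 1) + f_closed(n - 2, k - 1)
            )
```

The brute-force count grows quickly with n and is meant only as a sanity check on the formula, not as a practical method.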
Note. There are various other formulas for f(n, k); we have selected (4) as
it exhibits the asymptotic behaviour best. In an unpublished investigation
John Riordan obtained a neat representation as a hypergeometric function:

$$f(n, k) = 2(n - k)\,F(1 - k,\ 1 + k - n;\ 2;\ 2)$$

and derived corresponding recursion formulas. Essentially the same result 
was given by Wolfowitz [3]. Still another formula given by Riordan is 

A symbolic version is given in §5 of [2]. 

Proof of (2). From the formula of Poincaré ([1], eq. (29))

$$r!\,P(n, r) = \sum_{k=r}^{n} (-1)^{k+r}\,M(n, k)/(k - r)!$$

or, in a cabalistic symbolic form, $P(n, r) = M^r e^{-M}/r!$. We substitute the
successive terms of (1) and we may let the sum run to infinity at a cost of $O(n^{-m})$
for any positive m. The first term contributes³

$$\sum_{k=r}^{\infty} (-1)^{k+r}\,2^k/(k - r)! = 2^r \sum_{i=0}^{\infty} (-2)^i/i! = 2^r e^{-2}.$$

Again since

$$k^2 + k = (k - r)(k - r - 1) + (2r + 2)(k - r) + r^2 + r,$$

the next term yields

$$\sum_{k=r}^{\infty} (-1)^{k+r}(k^2 + k)\,2^{k-1}/(k - r)! = 2^r e^{-2}\left(2 - 2r - 2 + \frac{r^2 + r}{2}\right)$$

and so on in obvious fashion.

Some indication of the asymptotic behavior of P(n, r) is afforded by the following
table for n = 10. It is to be noted that, because of the form of (2),
the approach to Poisson is much more rapid for r = 0 and 3 than for other r.


  r    P(10, r)    Poisson    First two terms of (2)

  0     .132        .135        .135
  1     .300        .271        .298
  2     .305        .271        .298
  3     .179        .180        .180
  4     .065        .090        .072
  5     .015        .036        .018
  6     .002        .012        .001
  7     .000        .003       -.001
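The three columns of the table can be recomputed along the following lines. This is a sketch under stated assumptions: the exact probabilities are obtained from Poincaré's formula with M(n, k) = f(n, k)/C(n, k) and f taken from (4) (with f(n, 0) = 1 assumed for the empty selection); all function names are hypothetical.

```python
from fractions import Fraction
from math import comb, exp, factorial


def f(n, k):
    """Closed form (4); f(n, 0) = 1 is assumed for the empty selection."""
    if k == 0:
        return Fraction(1)
    return 2 ** k * sum(
        Fraction((-1) ** i * (k + i), 2 ** i * k) * comb(k, i) * comb(n - i, k - i)
        for i in range(k + 1)
    )


def factorial_moment(n, k):
    """M(n, k) from (3)."""
    return f(n, k) / comb(n, k)


def exact_p(n, r):
    """P(n, r) from r! P(n, r) = sum over k >= r of (-1)^(k+r) M(n, k)/(k-r)!."""
    total = sum(
        Fraction((-1) ** (k + r), factorial(k - r)) * factorial_moment(n, k)
        for k in range(r, n)
    )
    return total / factorial(r)


def poisson(r):
    """Poisson probability with mean 2."""
    return 2 ** r * exp(-2) / factorial(r)


def two_term(n, r):
    """First two terms of (2)."""
    return poisson(r) * (1 - (r * r - 3 * r) / (2 * n))


for r in range(8):
    print(r, round(float(exact_p(10, r)), 3), round(poisson(r), 3),
          round(two_term(10, r), 3))
```

Exact fractions are used for P(n, r) because the alternating sum suffers from cancellation in floating point.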


³ My thanks are due to Mr. Riordan for correcting an error in this section, and for many
helpful suggestions concerning the entire paper.





REFERENCES

[1] M. Fréchet, "Les probabilités associées à un système d'événements compatibles et
dépendants," Actualités Scientifiques et Industrielles, no. 859, Paris, 1940.

[2] I. Kaplansky, "Symbolic solution of certain problems in permutations," Bull. Amer.
Math. Soc., Vol. 50 (1944), pp. 906-914.

[3] J. Wolfowitz, "Additive partition functions and a class of statistical hypotheses,"
Annals of Math. Stat., Vol. 13 (1942), pp. 247-279.

[4] J. Wolfowitz, "Note on runs of consecutive elements," Annals of Math. Stat., Vol. 15
(1944), pp. 97-98.



ON THE APPROXIMATE DISTRIBUTION OF RATIOS 

By P. L. Hsu 

National University of Peking 

The purpose of this paper is to apply Cramér's theorem of asymptotic expansion¹
and Berry's theorem² to study the approximate distribution of ratios of the
following two types:

(I) $Z = \frac{1}{n}(Y_1 + \cdots + Y_n) \Big/ \frac{1}{m}(X_1 + \cdots + X_m) = \bar Y/\bar X$,

(II) $Z = Y \Big/ \frac{1}{m}(X_1 + \cdots + X_m) = Y/\bar X$.

In (I) the $X_i$, $Y_j$ are independent, the $Y_j$ are equi-distributed,³ and the $X_i$ are
equi-distributed and positive. In (II) $X_1, \ldots, X_m$, Y are independent and
positive, and the $X_i$ are equi-distributed.

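1. The ratio (I). Assume that

(I1) the absolute kth moment of $X_i$ and that of $Y_j$ are finite and positive, where k is a fixed integer ≥ 3,

(I2) the distribution of $X_i$ and that of $Y_j$ are non-singular.

Let

$$\xi = \mathcal{E}(X_i), \quad \eta = \mathcal{E}(Y_j), \quad \sigma^2 = \mathcal{E}(X_i^2) - \xi^2, \quad \tau^2 = \mathcal{E}(Y_j^2) - \eta^2$$

and

$$U = \frac{\sqrt{m}}{\sigma}(\bar X - \xi), \quad V = \frac{\sqrt{n}}{\tau}(\bar Y - \eta).$$

Let F(x), G(x) and H(x) be respectively the distribution functions of Z, U and
V. Let

$$b = \left(\frac{x^2\sigma^2}{m} + \frac{\tau^2}{n}\right)^{1/2}, \quad u = \frac{x\xi - \eta}{b}.$$

Then the relation Z ≤ x is equivalent to

$$-\frac{x\sigma U}{b\sqrt{m}} + \frac{\tau V}{b\sqrt{n}} \le u.$$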
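The quantities b and u defined above already give a usable first approximation, since the leading term of the expansion derived in this section is the normal distribution function evaluated at u. The following sketch computes only that leading term; the function name and argument names are hypothetical, and the correction terms of the full expansion are not included.

```python
from math import sqrt
from statistics import NormalDist


def ratio_cdf_leading_term(x, xi, eta, sigma, tau, m, n):
    """Leading-order approximation to Pr(Z <= x) for the ratio (I), x > 0.

    xi, sigma: mean and standard deviation of each X_i (m of them);
    eta, tau:  mean and standard deviation of each Y_j (n of them).
    """
    b = sqrt(x * x * sigma * sigma / m + tau * tau / n)
    u = (x * xi - eta) / b
    return NormalDist().cdf(u)


# Illustrative numbers only: at x = eta / xi the approximation equals one half.
print(ratio_cdf_leading_term(2.0, 1.0, 2.0, 1.0, 1.0, 50, 50))
```

The higher-order terms in powers of $m^{-1/2}$ and $n^{-1/2}$ depend on the moments of the two parent distributions and would have to be added for small samples.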


¹ H. Cramér. Random Variables and Probability Distributions (1937), Chap. 7.

² A. C. Berry. "The accuracy of the Gaussian approximation to the sum of independent
variates", Trans. Amer. Math. Soc., Vol. 49 (1941), pp. 122-136.

³ The $Y_j$ are said to be equi-distributed if all $Y_j$ have the same distribution function.
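For simplicity we shall assume x > 0; the results are, however, general. Then
the distribution functions of $-\frac{x\sigma U}{b\sqrt{m}}$ and $\frac{\tau V}{b\sqrt{n}}$ are

$$1 - G\!\left(-\frac{b\sqrt{m}}{x\sigma}\,y\right) \quad\text{and}\quad H\!\left(\frac{b\sqrt{n}}{\tau}\,y\right).$$

Hence, by the theorem of convolution,

$$F(x) = \int_{-\infty}^{\infty}\left\{1 - G\!\left(-\frac{b\sqrt{m}}{x\sigma}(u - y)\right)\right\} dH\!\left(\frac{b\sqrt{n}}{\tau}\,y\right). \tag{1}$$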

Here we recall the theorems of Cramér and Berry: Under the conditions (I1)
and (I2)

( 2 ) G(x) = m + E^ + J^,, 

I m tn 




P.w - Z e, 


and I Da I is less than a positive number which depends only on k and the distribu¬ 
tion of . If A; = 3, condition (12) may be removed/ 

Analogously, 


_i_ V' Q^(^) _L 

H(x) ^(x) + 2 ^ , 


Q.{x) = 

In the sequel we shall use the letter $A_k$ to denote an unspecified quantity such
that $|A_k|$ is less than a positive number which depends only on k, the distribution
of $X_i$ and the distribution of $Y_j$.

Using (2) we have 

(4) 1 - G{-x) = <h(x) + g — 

and making this substitution in (1) we get
i?r^\ ^ - y) ^ 




⁴ This last assertion constitutes Berry's theorem.

and so by partial integration,
F{z) = 


£“ jj dP, (^-^) 


lyjK*—2) * 


Making the transformation y == axvihy/m and writing 


we get 


F{x) = £" H{du - m'{v) dr + 2 /* //(au - fiy)P:{v) dv + 

-7 I 

For Jo we use (3) and obtain 

h = 4>(au - /jy)$'(t») dy + £ ^ £^ <?»(«« - /8y)4‘'(t') dv + . 

For ly we use (3) with k replaced by A; — Thus 

f “0 fc—8—V 1 f * A 

^ f>(aM - MPU’d) dv + ]C. ^2 £_ - fiv)P'.iv) dv + . 

Combining these results we get 

(6) F{x) = f ^(otu — 0v)^'(v) dv + 2 f Qv{ou — fivW{v) do 

J-go y-l J-PO 

+ 2 f ^(“« - ^v)P',(v) dv 

V.l J-oe 

fc—8 k — 3 — r / 1/** 

+ E Z L . 

v-i u-i m J-o© 


A* 


4>(cm — fiv)Py{v) dv 


_ Afc , Ajfc . 

;yji(^-2) ■+* ^i(fc-2) “f" ^ 


7 yj*'/2 ^i(i:-2~>') 


/ jL J_V“' 

\A/m \/n) 


Now by (5), $\alpha > 0$ and $\alpha^2 - \beta^2 = 1$. For such values of $\alpha$ and $\beta$, however, it
follows easily from the theorem of convolution that

$$\int_{-\infty}^{\infty} \Phi(\alpha u - \beta v)\,\Phi'(v)\,dv = \Phi(u).$$

As differentiation under the integration sign is justified by the boundedness of
the derivatives of $\Phi$ we have


A 00 


Repeated partial integration then gives 


f — /3v)^^^\v)dv ^ ^ f ^\au — fiv)^\v) i 

•Lqo •t-CO 


Hence 


f Qp(au — fiv)^'(v) dv ^ djp f — fiv)^'(v) i 

j—eo jiml »“00 




f ao 9 Mto 

^{au fiv)P[(v) dv = 2 

00 /■■I *^00 


2 


f Qf,(au — 0v)Py{v) dv = ti: 

J-OO «—1 J—I J 




Making all these substitutions in (6) we obtain the final result 


m - *(«)+g i: ♦<■<»-(„) + § „4 S ^ 


1 dj. 


k—dk—d—p ( _ (t p ai>+2j 

+ S iC d,>C> ^+r+2»+2> (u) 

^-1 m n or 

/ 1 1 


If k = 3, the result remains true without the condition (I2).

2. The ratio (II). Here we make the following assumptions:

(II1) The kth moment of $X_i$ is finite and positive, where k is a fixed integer
≥ 3, $\mathcal{E}(X_i) = 1$,⁵ $\sigma^2 = \mathcal{E}(X_i^2) - 1$.

(II2) The distribution of $X_i$ is non-singular.


⁵ As the case $\mathcal{E}(X_i) = 0$ is excluded, there is no loss of generality in this assumption.

Let $U = \sqrt{m}(\bar X - 1)/\sigma$, and F(x), G(x) and H(x) be respectively the
distribution functions of Z, U and Y. Then




Because of the positiveness of Xi and Y we may always assume x > 0. Then, 
by the theorem of convolution, 

~ 

Using (4) we have 

^Vm {x — ^ (—1)” ( Vm(a; — y) 


Fix) = 


ax 


m'l^ 


X 


aX 


)} 


dH{y) +^iu/ 2 ), 


m’ 


where, as throughout the rest of this paper, $A_k$ represents an unspecified quantity
such that $|A_k|$ is less than a positive number depending only on k, the distribution
of $X_i$ and the distribution of Y. By partial integration we get




(7) Fix) = fnix -y) ^ 

I \ 0 - 2 : / ,.-.1 . ““^ r /2 

An interesting special case is the following: Suppose that (113) '’(x) 

exists and is continuous for all x > 0; (II4) the functions 

$,(x) = x'H^-\x) 




iv = 1, • • • , /c — 3) 

are bounded, i.e. 

{,(x) = Ak ; 

(II3) there is a positive constant c < 1 such that 

= Ak 

for all X > 0 and (1 — c)x < 2/ < (I + c)x. Under these conditions we have 
(-l)’v'x' 2 ',ff‘"(x) 


4 -^) = g 


y! 


(A: - ^ V \/m,/ 


and so, for \z\ < we have 


( 8 ) 


\ a / to / k-o plm"'^ wb*-2) ■ 
Separate now the integral in (7) into two parts:

A = / _, h = I 

•'If] 




Now 


|/s|<f ^ *'(2) + li 

•'|f|>cvm/^ *'■“1 


i-iypUz) ^ 

m'l* 


dz. 


Evidently this last integral is exponentially small and so is $A_k/m^{\frac12(k-2)}$. By (8),

Combining these results we obtain 

- £(§ ^ 

Jfc—8 j y k—Z k—Z M J I. A 

T XT' 


Au 

m*(*~^*> 


_ r 1 V' ^ j 

h> m'« if ^ »«»<'■+>') + m»»-« 

= z + 2: + 


-4* 




where

$$I_{\alpha\beta} = \int_{-\infty}^{\infty} z^{\alpha}\,\Phi^{(\beta)}(z)\,dz.$$

Now the following facts can easily be established by means of partial integration:

(9) $I_{\alpha\beta} = 0$ when $\alpha - \beta$ is even,

(10) $I_{\alpha\beta} = 0$ when $\beta - \alpha > 1$.

By (9), the non-vanishing terms in $\Sigma_1$ are the even terms and the non-vanishing
terms in $\Sigma_2$ those for which $\mu + \nu$ is even. Hence


E = 

i 

Z = 


lJ(fc~3)l 

E 


g» fe y 


[i(fc-8)l IJ(A;-3)1 2 m . > 

F-0 


/2f, 


2M+2y+l 



g;>> fe f+1 r 

^M+r+I 
Using (10) to reduce $\Sigma_2$ further we get


E I*(^)l A g > CK^Ol i-l 2^1 ; t 

V' V' V' r _j_ V' r 

- Li i2F.2M4-2/+l + 2^ 2-r 2^ " l,Vi i2y-hl,2»4-2/+2 

2 W-.2 M-1 J-l »-l M-0 j-1 


^ y' Qnr I V 

^ ;^:o ^ 


= £ 53 ^«|8i24+4+ 2 ~^2 53 

[J (*-«)] 1 t-3 [4(fc~8)] . i-2 j 

” 2 £ ^vfe;>4 + 53 

SS m';-[j(?-2)] " m* t»tCA:-2) 


,-2 m* y-[i(t-l)l 


Hence 


£ + £ - i. ++"'g*' i 


( r -8 >»-2 ^ \ 

ep^2p + + 51 iMKf2»+s)+ 

#*-[4(k-2)1 M-U(*'-2)1 / 


fe-3 1 2v j 

fo + 53 53 ?;>$; + -"j'(ir23 • 

" y-r+i m*'* 


Hence 


F{x) = f 0 + 53 53 PjKfj + ~^, 


Our final conclusion is: Under the conditions (II1)-(II5) formula (11) is true;
if k = 3, (11) remains true without the condition (II2).



ON THE DISTRIBUTION OF THE SERIAL CORRELATION COEFFICIENT 

By Herman Rubin 

Cowles Commission for Research in Economics, University of Chicago 

The distribution of the serial correlation coefficient, in samples drawn from
a parent distribution with zero serial correlation, has been studied by many
authors. Anderson [1] obtained the exact distribution. Dixon [3] and Koopmans
[4] have given approximate distributions, each attained by smoothing the
characteristic values of the numerator of $\bar r$ in (1) below. Dixon smoothed the
characteristic values in the generating function and obtained his results by
comparing the moments of the exact distribution with those of the approximation,
of which the first T are found to be exact. Koopmans smoothed the
characteristic values in the exact distribution function. Here we evaluate
Koopmans' result and show that it is the same as Dixon's approximation. It
thus appears that in this case it is immaterial whether the characteristic values
are smoothed before or after inverting the characteristic function. We also
add tables comparing confidence limits for the exact distribution, for the
approximation referred to, and for a normal approximation.

We define the serial correlation coefficient as

$$\bar r = \frac{\sum_{i=1}^{T} x_i x_{i+1}}{\sum_{i=1}^{T} x_i^2}, \qquad x_{T+1} = x_1. \tag{1}$$

Then Koopmans obtains, if the true value $\rho$ of $\bar r$ equals 0, and the $x_i$ are normally
and independently distributed with mean 0 and variance $\sigma^2$, the approximate
distribution

$$K(\bar r, T) = \frac{(\tfrac12 T - 1)\,2^{\frac12 T}}{\pi}\int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 2}\,\sin\tfrac12 T\alpha\,\sin\alpha\,d\alpha. \tag{2}$$

Although in the distribution problem T is a positive integer, it is useful to
consider the right-hand member of (2) as the definition of $K(\bar r, T)$ for those
complex values of T for which it exists.

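Let $R(T)$ denote the real part of T. If $R(T) > 2N + 2$, we obtain

$$\frac{\partial^N}{\partial \bar r^N} K(\bar r, T) = (-1)^N\,\frac{2^{\frac12 T}}{\pi}\,(\tfrac12 T - 1)(\tfrac12 T - 2)(\tfrac12 T - 3)\cdots(\tfrac12 T - N - 1)\int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - N - 2}\,\sin\tfrac12 T\alpha\,\sin\alpha\,d\alpha. \tag{3}$$

Now, according to [2], tables 41, 42,

$$\int_0^{\pi/2} (\cos\alpha)^{\frac12 T - N - 2}\,\sin\tfrac12 T\alpha\,\sin\alpha\,d\alpha = \frac{\pi T\,\Gamma(\tfrac12 T - N - 1)}{2^{\frac12 T - N + 1}\,\Gamma(\tfrac12(T - N + 1))\,\Gamma(\tfrac12(1 - N))}. \tag{4}$$

Denote by $K^{(N)}(0, T)$ the value of $\frac{\partial^N}{\partial \bar r^N} K(\bar r, T)$ for $\bar r = 0$. Then for $R(T) > 2N + 2$,

$$K^{(N)}(0, T) = \frac{(-1)^N\,2^N\,\Gamma(\tfrac12 T + 1)}{\Gamma(\tfrac12(T - N + 1))\,\Gamma(\tfrac12(1 - N))}. \tag{5}$$

$K(\bar r, T)$ is analytic in $\bar r$ for $|\bar r| < 1$, $R(T) > 2$, and is analytic in T for $|\bar r| < 1$,
$R(T) > 2$. It follows by Hartogs's theorem [5] that $K(\bar r, T)$ is analytic in $\bar r$
and T for $|\bar r| < 1$, $R(T) > 2$. By analytic continuation we get that (5) holds
for $R(T) > 2$. Consequently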


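(6) If N is odd, $K^{(N)}(0, T) = 0$;

(7) if N is even,

$$\frac{K^{(N)}(0, T)}{K(0, T)} = \frac{2^N\,\Gamma(\tfrac12 T + \tfrac12)\,\Gamma(\tfrac12)}{\Gamma(\tfrac12(T - N + 1))\,\Gamma(\tfrac12(1 - N))}.$$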


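Let N = 2P, then

$$\frac{1}{(2P)!}\,\frac{K^{(2P)}(0, T)}{K(0, T)} = \frac{2^{2P}}{(2P)!}\left(\frac{T-1}{2}\right)\left(\frac{T-3}{2}\right)\cdots\left(\frac{T-2P+1}{2}\right)\frac{\Gamma(\tfrac12)}{\Gamma(\tfrac12 - P)}$$

$$= \frac{(-1)^P}{(2P)!}\left(\frac{T-1}{2}\right)\left(\frac{T-3}{2}\right)\cdots\left(\frac{T-2P+1}{2}\right)\frac{(2P)!}{P!}$$

$$= \frac{1}{P!}\left[\frac{d^P}{d(\bar r^2)^P}\,(1 - \bar r^2)^{\frac12(T-1)}\right]_{\bar r = 0}. \tag{8}$$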


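According to (5)

$$K(0, T) = \frac{\Gamma(\tfrac12 T + 1)}{\Gamma(\tfrac12 T + \tfrac12)\,\Gamma(\tfrac12)}. \tag{9}$$

Hence

$$K(\bar r, T) = \frac{\Gamma(\tfrac12 T + 1)}{\Gamma(\tfrac12 T + \tfrac12)\,\Gamma(\tfrac12)}\,(1 - \bar r^2)^{\frac12(T-1)}, \tag{10}$$

which is the same as Dixon's expression (3.22).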

A more elementary proof by complete induction for integral values of T can
be based on the recurrent differential equation (14) which is of interest in itself.
To this end we shall write (2) in a different form which is easily obtained through
partial integration.

$$K(\bar r, T) = \frac{\tfrac12 T\,2^{\frac12 T}}{\pi}\int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 1}\,\cos\tfrac12 T\alpha\,d\alpha. \tag{11}$$
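Differentiating with respect to $\bar r$,

$$K'(\bar r, T) = -\frac{\tfrac12 T(\tfrac12 T - 1)\,2^{\frac12 T}}{\pi}\int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 2}\,\cos\tfrac12 T\alpha\,d\alpha$$

$$= -\frac{\tfrac12 T(\tfrac12 T - 1)\,2^{\frac12 T}}{\pi}\int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 2}\,\{\cos\tfrac12(T-2)\alpha\,\cos\alpha - \sin\tfrac12(T-2)\alpha\,\sin\alpha\}\,d\alpha$$

$$= -\frac{\tfrac12 T(\tfrac12 T - 1)\,2^{\frac12 T}}{\pi}\int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 1}\,\cos\tfrac12(T-2)\alpha\,d\alpha$$
$$\quad - \frac{\tfrac12 T(\tfrac12 T - 1)\,2^{\frac12 T}\,\bar r}{\pi}\int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 2}\,\cos\tfrac12(T-2)\alpha\,d\alpha$$
$$\quad + \frac{\tfrac12 T(\tfrac12 T - 1)\,2^{\frac12 T}}{\pi}\int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 2}\,\sin\tfrac12(T-2)\alpha\,\sin\alpha\,d\alpha \tag{12}$$

$$= -\frac{\tfrac12 T(\tfrac12 T - 1)\,2^{\frac12 T}\,\bar r}{\pi}\int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 2}\,\cos\tfrac12(T-2)\alpha\,d\alpha, \tag{13}$$

because the first and third terms in (12) cancel as may be shown by integrating
by parts.

Hence (13) reduces to the recurrent differential equation

$$K'(\bar r, T) = -2\cdot\tfrac12 T\,\bar r\,K(\bar r, T - 2). \tag{14}$$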


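Let us now assume that

$$K(\bar r, T - 2) = \frac{\Gamma(\tfrac12 T)}{\Gamma(\tfrac12 T - \tfrac12)\,\Gamma(\tfrac12)}\,(1 - \bar r^2)^{\frac12(T-3)}. \tag{15}$$

Then (14) becomes

$$K'(\bar r, T) = -2\cdot\tfrac12 T\,\bar r\,\frac{\Gamma(\tfrac12 T)}{\Gamma(\tfrac12 T - \tfrac12)\,\Gamma(\tfrac12)}\,(1 - \bar r^2)^{\frac12(T-3)}
= -2\bar r\,\tfrac12(T - 1)\,\frac{\Gamma(\tfrac12 T + 1)}{\Gamma(\tfrac12 T + \tfrac12)\,\Gamma(\tfrac12)}\,(1 - \bar r^2)^{\frac12(T-1) - 1}. \tag{16}$$

Integrating, one obtains

$$K(\bar r, T) = \frac{\Gamma(\tfrac12 T + 1)}{\Gamma(\tfrac12 T + \tfrac12)\,\Gamma(\tfrac12)}\,(1 - \bar r^2)^{\frac12(T-1)}. \tag{17}$$

No constant of integration occurs because (17) agrees with (5) for $\bar r = 0$ and
N = 0.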
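It remains to prove the validity of (17) for the initial values T = 3 and T = 4.
If T = 4

$$K(\bar r, 4) = \frac{4}{\pi}\int_0^{\arccos \bar r} \sin 2\alpha\,\sin\alpha\,d\alpha = \frac{8}{3\pi}\,\sin^3\alpha\,\Big|_0^{\arccos \bar r} = \frac{\Gamma(3)}{\Gamma(\tfrac52)\,\Gamma(\tfrac12)}\,(1 - \bar r^2)^{\frac32}. \tag{18}$$

For T = 3,

$$K(\bar r, 3) = \frac{\sqrt{2}}{\pi}\int_0^{\arccos \bar r} \frac{\sin\tfrac32\alpha\,\sin\alpha}{\sqrt{\cos\alpha - \bar r}}\,d\alpha. \tag{19}$$

Substitute $\cos\alpha = \bar r + (1 - \bar r)\sin^2\theta$. We get

$$K(\bar r, 3) = \frac{2(1 - \bar r)}{\pi}\int_0^{\pi/2} \{(1 + 2\bar r)\cos^2\theta + 2(1 - \bar r)\sin^2\theta\,\cos^2\theta\}\,d\theta = \tfrac34(1 - \bar r^2) = \frac{\Gamma(\tfrac52)}{\Gamma(2)\,\Gamma(\tfrac12)}\,(1 - \bar r^2), \tag{20}$$

which completes the proof.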

A short table of confidence limits is included, corresponding to the 5% and 
1% significance levels, comparing the exact distribution given by Anderson [1] 
(the values in parentheses being graphically interpolated by him), the distribu¬ 
tion (10), and the normal curve with the same mean and standard deviation. 


Confidence limits for $\bar r$

            5%                            1%
  T    Exact    (10)    Normal      Exact    (10)    Normal

  3    .864     .729    .736        .970     .882    1.040
  4    .713     .669    .672        .898     .833    .950
  5    .622     .621    .622        .823     .789    .879
  6    .570     .582    .582        .702     .750    .823
  7    .545     .549    .548        .714     .715    .775
  8    (.521)   .521    .520        (.682)   .685    .736
  9    .498     .497    .496        .656     .658    .701
 10    (.477)   .476    .475        (.633)   .634    .672
 11    .457     .458    .456        .612     .612    .645
 15    .400     .400    .399        .543     .543    .564
 20    (.361)   .352    .351        (.480)   .482    .496
 25    .317     .317    .317        .437     .437    .448
 30    (.291)   .291    .291        (.404)   .403    .411
 35    (.271)   .271    .270        (.377)   .376    .382
 40    (.255)   .254    .254        (.355)   .354    .359
 45    .240     .240    .240        .335     .335    .339
It is thus seen that the distribution (10) provides satisfactory significance levels
for T > 9 whereas the normal approximation provides satisfactory 5% significance
levels for the same range. The normal approximation appears to be
unsatisfactory, however, at the 1% significance level even for T as high as 45.
The normal approximation here used is not the same as that used by Anderson

v/ y r 

([1], p. 53), which assumes ■ to be normally distributed. 

V 1 + 2f2 
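The "(10)" and "Normal" columns of the tables can be recomputed numerically. The sketch below assumes that the tabulated limits are one-sided upper limits, and that the normal curve has mean 0 and variance 1/(T + 2), which is the variance of the density (10); both assumptions agree with the printed values for T = 3 and T = 45. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def density(r, T):
    """Density (10): Gamma(T/2 + 1) / (Gamma(T/2 + 1/2) Gamma(1/2)) * (1 - r^2)^((T-1)/2)."""
    const = math.gamma(T / 2 + 1) / (math.gamma(T / 2 + 0.5) * math.gamma(0.5))
    return const * (1.0 - r * r) ** ((T - 1) / 2)


def upper_tail(c, T, steps=2000):
    """Pr(r_bar > c) under (10), by Simpson's rule on [c, 1]; steps must be even."""
    h = (1.0 - c) / steps
    total = density(c, T) + density(1.0, T)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(c + i * h, T)
    return total * h / 3


def limit_from_10(T, alpha):
    """Upper confidence limit at level alpha under (10), found by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if upper_tail(mid, T) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def limit_normal(T, alpha):
    """Upper limit of the normal curve with mean 0 and variance 1/(T + 2)."""
    return NormalDist().inv_cdf(1 - alpha) / math.sqrt(T + 2)


for T in (3, 15, 45):
    print(T, round(limit_from_10(T, 0.05), 3), round(limit_normal(T, 0.05), 3))
```

Simpson's rule is adequate here because the integrand is smooth on the interval for T ≥ 3; a library quadrature routine could be substituted.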


The following table shows a comparison between a few more confidence limits 
of the Type II curve (10) and the normal curve with same first two moments 
for a few values of T. 


Confidence limits for $\bar r$

            5%              4%              3%              2%              1%
  T    (10)  Normal    (10)  Normal    (10)  Normal    (10)  Normal    (10)  Normal

 15    .400   .399     .423   .425     .452   .456     .488   .498     .543   .564
 20    .352   .351     .373   .373     .398   .401     .431   .438     .482   .496
 25    .317   .317     .336   .337     .360   .362     .390   .395     .437   .448
NOTES

This section is devoted to brief research and expository articles, notes on methodology
and other short items.


A NOTE CONCERNING HOTELLING'S METHOD OF INVERTING A
PARTITIONED MATRIX

By F. V. Waugh

War Food Administration, Washington

Professor Hotelling recently presented several methods of computing the
inverse of a matrix.¹ Among these was a method of partitioning a square matrix
of 2p rows into four square matrices, a, b, c and d, of p rows each, resulting in
the partitioned matrix,

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}.$$

The inverse of this matrix can also be written as a partitioned matrix,

$$\begin{bmatrix} A & C \\ B & D \end{bmatrix}.$$

Then, multiplying the original matrix by its inverse we get four matrix equations,

$$aA + bB = 1, \qquad aC + bD = 0,$$
$$cA + dB = 0, \qquad cC + dD = 1.$$

These equations can be solved for A, B, C, and D.

Professor Hotelling's solution requires the inversion of four p-rowed matrices.
It is possible, however, to solve these equations by formulas involving only two
inversions. The formulas are

$$D = (d - ca^{-1}b)^{-1}, \qquad B = -Dca^{-1},$$
$$C = -a^{-1}bD, \qquad A = a^{-1} - a^{-1}bB.$$
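The two-inversion scheme can be sketched in a few lines for the case p = 2. This is an illustration only: the helper names (`inv2`, `mul`, `sub`, `neg`, `partitioned_inverse`) are hypothetical, plain nested lists stand in for matrices, and the numbers are those of the worked example in this note.

```python
def inv2(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (p, q), (r, s) = m
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def mul(x, y):
    """Matrix product of nested lists."""
    return [[sum(x[i][t] * y[t][j] for t in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def sub(x, y):
    return [[x[i][j] - y[i][j] for j in range(len(x[0]))] for i in range(len(x))]


def neg(x):
    return [[-v for v in row] for row in x]


def partitioned_inverse(a, b, c, d):
    """Inverse of [[a, b], [c, d]] using only two 2 x 2 inversions."""
    a_inv = inv2(a)                        # first inversion
    a_inv_b = mul(a_inv, b)
    c_a_inv = mul(c, a_inv)
    D = inv2(sub(d, mul(c_a_inv, b)))      # second inversion
    B = neg(mul(D, c_a_inv))
    C = neg(mul(a_inv_b, D))
    A = sub(a_inv, mul(a_inv_b, B))
    # Assemble the inverse as [[A, C], [B, D]].
    return [A[0] + C[0], A[1] + C[1], B[0] + D[0], B[1] + D[1]]


a = [[26, -10], [19, 45]]
b = [[15, 32], [-14, -8]]
c = [[-12, 16], [32, 29]]
d = [[27, 13], [-35, 28]]
original = [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]
inverse = partitioned_inverse(a, b, c, d)
for row in inverse:
    print([round(v, 5) for v in row])
```

For larger p the same four formulas apply with `inv2` replaced by any routine for inverting a p-rowed matrix.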

As an example of the procedure let the given matrix be

$$\begin{bmatrix} 26 & -10 & 15 & 32 \\ 19 & 45 & -14 & -8 \\ -12 & 16 & 27 & 13 \\ 32 & 29 & -35 & 28 \end{bmatrix}.$$


¹ Harold Hotelling. "Some new methods of matrix calculation," Annals of Math.
Stat., Vol. 14 (1943), pp. 1-34.
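The necessary steps in computation are

$$a^{-1} = \begin{bmatrix} .03309 & .00735 \\ -.01397 & .01912 \end{bmatrix}, \qquad a^{-1}b = \begin{bmatrix} .39345 & 1.00008 \\ -.47723 & -.60000 \end{bmatrix},$$

$$ca^{-1} = \begin{bmatrix} -.62060 & .21772 \\ .65375 & .78968 \end{bmatrix}, \qquad ca^{-1}b = \begin{bmatrix} -12.35708 & -21.60096 \\ -1.24927 & 14.60256 \end{bmatrix}.$$

Note that a convenient check at this point is to compute both
$(ca^{-1})b$ and $c(a^{-1}b)$.

$$d - ca^{-1}b = \begin{bmatrix} 39.35708 & 34.60096 \\ -33.75073 & 13.39744 \end{bmatrix},$$

$$(d - ca^{-1}b)^{-1} = D = \begin{bmatrix} .00790 & -.02041 \\ .01991 & .02322 \end{bmatrix},$$

$$-a^{-1}bD = C = \begin{bmatrix} -.02302 & -.01519 \\ .01572 & .00419 \end{bmatrix},$$

$$-Dca^{-1} = B = \begin{bmatrix} .01825 & .01440 \\ -.00282 & -.02267 \end{bmatrix},$$

$$a^{-1} - a^{-1}bB = A = \begin{bmatrix} .02873 & .02436 \\ -.00696 & .01239 \end{bmatrix}.$$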

The last four of these matrices are the four parts of the inverse, which can be
written

$$\begin{bmatrix} .02873 & .02436 & -.02302 & -.01519 \\ -.00696 & .01239 & .01572 & .00419 \\ .01825 & .01440 & .00790 & -.02041 \\ -.00282 & -.02267 & .01991 & .02322 \end{bmatrix}.$$

The accuracy of the computations can be checked by multiplying the original
matrix by the computed inverse matrix. The product should, of course, be a
close approximation of the identity matrix. If further accuracy is called for
we can use Hotelling's iterative formula,

$$C_1 = C_0(2 - AC_0),$$

where $C_0$ is the estimated inverse; A is the original matrix; and $C_1$ is a second
approximation of the inverse.
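One step of the iterative formula can be tried on the five-decimal inverse printed in this note. The sketch below is illustrative; `matmul`, `refine` and `max_residual` are hypothetical helper names, and the "2" in the formula is read as twice the identity matrix.

```python
def matmul(x, y):
    """Matrix product of square nested lists."""
    n = len(x)
    return [[sum(x[i][t] * y[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def refine(c0, a):
    """One step of the iteration C1 = C0 (2I - A C0)."""
    n = len(a)
    ac = matmul(a, c0)
    two_minus_ac = [[(2.0 if i == j else 0.0) - ac[i][j] for j in range(n)]
                    for i in range(n)]
    return matmul(c0, two_minus_ac)


def max_residual(a, c):
    """Largest absolute entry of A C - I."""
    n = len(a)
    ac = matmul(a, c)
    return max(abs(ac[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))


A = [[26, -10, 15, 32],
     [19, 45, -14, -8],
     [-12, 16, 27, 13],
     [32, 29, -35, 28]]

# Five-decimal inverse as printed in the note.
C0 = [[0.02873, 0.02436, -0.02302, -0.01519],
      [-0.00696, 0.01239, 0.01572, 0.00419],
      [0.01825, 0.01440, 0.00790, -0.02041],
      [-0.00282, -0.02267, 0.01991, 0.02322]]

C1 = refine(C0, A)
print(max_residual(A, C0), max_residual(A, C1))
```

Since I − AC₁ = (I − AC₀)², each step roughly squares the residual as long as the starting estimate is reasonably close.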



NEWS AND NOTICES 

Readers are invited to submit to the Secretary of the Institute news items of interest 

Personal Items 

Professor W. G. Cochran of Iowa State College has gone overseas as a con¬ 
sultant for the United States War Department. 

Professor A. R. Crathorne of the University of Illinois has retired with the
title of Professor Emeritus. 

Professor William Feller of Brown University has been appointed Professor 
of Mathematics at Cornell University, Ithaca, New York, as of July 1, 1945. 

Associate Professor Joe J. Livers has returned to Montana State College at 
Bozeman after receiving his doctorate in February at the University of Michigan. 

Assistant Professor W. A. Vezeau of the University of Detroit has been ap¬ 
pointed Assistant Professor of Mathematics at St. Louis University. 

Associate Professor S. S. Wilks of Princeton University has been promoted to 
a professorship. 

The American Statistical Association elected ten Fellows during 1944. Of 
these ten, five are members of the Institute. They are A. E. Brandt, W. G. 
Cochran, Gertrude M. Cox, Alan Treloar, and Sewall Wright. The President 
of the Association is Dr. Walter A. Shewhart, a charter member of the Institute 
and its President during 1944. 


New Members 

The following persons have been elected to membership in the Institute: 

Allendoerfer, Asso. Prof. Carl B. Ph.D. (Princeton) Haverford College, Haverford, Pa.
Beckstead, Lt. (j.g.) Gordon L. M.S. (Michigan) Aerologist,U.S. Navy. Aerology, Navy 
^151, c/o Fleet Post Office, San Francisco, Calif. 

Berman, Abraham J, M.A. (Brooklyn) Statistician. 14OO College Avenue, New York, 
N. Y, 

Bigelow, Julian H. Asso. Director, Statistical Research Group, Columbia University. 
401 West 118th St., New York 27, N. Y. 

Bowen, Earl K. A.M. (Boston) Instr. Math. Northeastern Univ., Boston, Mass. On 
military leave—Scientific Consultant, Office of Field Service, O.S.R.D. 6 Sibley Ave., 
W, Springfield, Mass. 

Canter, Stanley D. B.S. (Coll. City of N. Y.) Statistician, Lerner Shops, Inc., New York, 
N. Y. 2676 Morris Ave., The Bronx, 58, New York, N. Y. 

Cohen, Karl. Ph.D. (Columbia) Physicist, Standard Oil Development Co. Esso Laboratories,
Research Division, P. O. Box 243, Elizabeth B, N. J.

Cooper, William W. A.B. (Chicago) Instr. in Economics, University of Chicago. 6539 S. 
Ellis Ave., Chicago 37, Ill. 

Davidson, James H. B.S. (Norwich Univ.) Research Physicist, Hercules Powder Co. 
Box $44, Christiansburg, Va. 

Epstein, Benjamin Ph.D. (Illinois) Staff Assistant, Westinghouse Electric & Mfg. Co., 
Quality Control Dept., Rm. 3-A-17, East Pittsburgh, Pa. 
Gauthier, Prof. Abel A.M. (Columbia) Prof. of Mathematics, Université de Montréal,
2900 Mount Royal Blvd., Montreal, Canada.

Geraten, Lydia Blumenthal B.A. (Hunter) Res. Stat. 1001 Lincoln Place, Brooklyn IS,
N. Y.

Goffman, Casper Ph.D. (Ohio State) Staff Asst., Quality Control Dept., Westinghouse
Elec. & Mfg. Co., Rm. 3-A-17, East Pittsburgh, Pa.

Hastay, Millard W. B.A. (Reed) Asso. Math., Stat. Res. Group, Columbia University.
401 West 118th St., New York 27, N. Y.

Houseman, Earl E. M.A. (South Dakota) Head Sampling Sec., Stat., Division of Program
Surveys, Bur. of Agric. Econ., Washington 26, D. C.

James, R. W. M.A. (Toronto) Asst, to Director, Washington Div., Wartime Prices & 
Trade Board. Room 3068, Railroad Retirement Bldg., Washington, D. C. 

Jones, Robert Richard, Jr. A.B. (Columbia) 61 Jackson St., New Rochelle, N. Y. 

Kac, Asst. Prof. Mark Ph.D, (John Casimir Univ., Lwow) Math. Dept., Whitehall, 
Cornell University, Ithaca, N. Y. 

Knoepfel, Margaret F. A.B. (Brooklyn) Jr. Stat., Weather Bureau, Washington, D, C. 
3S06 Ely Place, S.E., Washington 19, D. C. 

Ladd, Robert Boyd M.A. (Texas Coll, of Arts & Industries) Stat. Consultant, OCT, 
Transport Economics, Traffic Control Div., War Dept., Washington, D. C. BOS Wade 
Ave., Rockville, Md. 

Larson, Charles M. B.Sc. (Nebraska) Stat. Analyst, Northrop Aircraft, Inc. 51^. West 
125th St., Hawthorne, Calif. 

Lesansky, William A. B.B.A. (City Coll, of N. Y.) Stat., War Dept., Washington, D. C. 
1841 Summit Place, N.W., Washington 9, D. C. 

Lewis, Wyatt H. B.S. (Calif. Inst, of Tech.) Quality Control Engineer. 212 East H 
Street, Ontario, Calif. 

Mathisen, Ensign Harold C. A.B. (Princeton) Ensign, USNR. 59 Fernwood Road, East 
Orange., N. J. 

Miller, Robert Carmi Res. Engineer, Elgin National Watch Co., Elgin, Ill. 

Mittra, Probodh Chandra B.Sc. (India) Grad. Student in Math. Stat., Columbia Uni¬ 
versity, New York 27, N. Y. 

Neumann, Prof. John von Ph.D. (Budapest) Institute for Advanced Study, Princeton, 
N. J. 

Noland, Asst. Prof. E. William Ph.D. (Cornell) Dept. of Sociology & Anthropology,
McGraw Hall, Cornell University, Ithaca, N. Y.

Okun, Yetta Edith B.A. (Hunter) Res. Asst., Dept. of Labor, Washington, D. C. 2120
Will St., N.W., Washington 9, D. C.

Owen, F. V. Ph.D. (Wisconsin) Geneticist, U. S. Dept. of Agric. 1810 S. Main St., Salt
Lake City, Utah.

Poston, Paul Lehman B.S. (California) Statistician. George Washington Carver Hall,
211 Elm St., Washington D. C.

Rice, William B. A.B. (Davidson) Director, Dept. of Stat. & Reports, Plomb Tool Co.
906 Baldwin, El Monte, Calif.

Rudnicki, Alex. B.S. (City Coll. of N. Y.) Grad. Student in Math. Stat. 1072 Lorimer
St., Brooklyn 22, N. Y.

Rupp, William B. Mgr., Quality Control Dept., RCA Victor Div., Radio Corp. of America,
Harrison, N. J. 29 Dodd St., East Orange, N. J.

Savage, Leonard J. Ph.D. (Michigan) Res. Math., Stat. Res. Group, Columbia University,
401 West 118th St., New York 27, N. Y.

Sheppard, David B.S. (Yale) Statistician, Army Air Forces. 2721 Terrace Road, S.E., 
Washington 20, D. C. 

Smith, Prof. James Gerald Ph.D. (Princeton) Prof, of Economics, Princeton University. 
80 Murray Place, Princeton, N. J. 
Stigler, Prof. George J. Ph.D. (Chicago) Prof. of Economics; Member, Res. Staff, National
Bureau of Econ. Res., University of Minnesota, Minneapolis, Minn.

Weingarten, Harry M.A. (Columbia) Math. Teacher, School of Aviation Trades. ISSO
Morris Ave., Bronx 50, N. Y.

Weinstein, Joseph M.S. (C.C. N. Y.) Res. Analyst, Vacuum Tube Tests & Standardization,
Camp Evans Signal Lab., Signal Corps. IS Washington Village, Asbury Park, N. J.

Westman, A. E. R. Ph.D. (Toronto) Dir. of Chem. Res., Ontario Research Foundation,
43 Queen's Park, Toronto 5, Canada.

Wilcox, Sidney W. L.B. (California) Chief Stat., Bur. of Labor Stat. Room 2318, Dept.
of Labor, Washington 25, D. C.

Young, Captain Chen-Pang B.A. (National Tsing Hua Univ., China) Ordnance Dept.,
Chinese Army. 2S11 Massachusetts Ave., N. W., Washington 0, D. C.

Corrections to the Directory Published in the December 1944 Issue 

The name of Dr. Walter Schilling was omitted from the Directory. It should 

have appeared as follows: 

Schilling, Walter M.D. (Harvard) Asst. Clinical Professor of Medicine, 

Stanford University Hospital, San Francisco 15, California. 

The name of Professor Godfrey H. Thomson, Director of the Training of

Teachers, University of Edinburgh, Edinburgh, Scotland, was misspelled.



CHOICE OF ONE AMONG SEVERAL STATISTICAL HYPOTHESES

By Ralph J. Brookner¹

New York City

1. Introduction. Statistical decision is a term which we will apply to that
phase of statistical inference which deals with the following question. Consider
one or several variates whose distribution function depends on one or
several unknown parameters; suppose there be given a finite number of mutually
exclusive hypotheses regarding the parameters, whose totality completely
exhausts every possibility. If a sample of observations on the variates is made,
the choice of one of the given hypotheses on the basis of that sample is called a
statistical decision. In other words, to make a statistical decision is to give a
procedure which will divide the sample space into as many regions as there are
given hypotheses, and to set up a one-to-one correspondence between these
regions and the hypotheses so that if the sample point lies in any particular
region, the corresponding hypothesis is chosen.
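The definition can be made concrete with a toy procedure. The sketch below is purely illustrative and is not the principle proposed in this paper: it assumes three hypotheses about a mean, and it partitions the sample space according to which of three intervals contains the sample mean. The cut points, hypothesis labels, and function name are all hypothetical.

```python
from bisect import bisect_right


def make_decision_rule(cutpoints, hypotheses):
    """Return a procedure mapping a sample to exactly one hypothesis.

    cutpoints must be sorted, and len(hypotheses) == len(cutpoints) + 1,
    so that each region of the sample space corresponds to one hypothesis.
    """
    if len(hypotheses) != len(cutpoints) + 1:
        raise ValueError("need one hypothesis per region")

    def decide(sample):
        mean = sum(sample) / len(sample)
        # Index of the region (interval of sample-mean values) holding the sample.
        return hypotheses[bisect_right(cutpoints, mean)]

    return decide


hypotheses = ["mean is negative", "mean is near zero", "mean is positive"]
decide = make_decision_rule([-0.5, 0.5], hypotheses)
print(decide([1.2, 0.9, 1.6]))
```

The regions here are mutually exclusive and exhaust the sample space, which is the only property the definition requires; choosing good regions is the actual problem.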

This notion is quite closely connected with both of the fields of statistical
inference that have engaged most of the modern statistical theorists. On the one
hand, it may be considered a generalization of the notion of testing hypotheses,
for in this theory, one gives a procedure which divides the sample space into a
region of rejection and a region of non-rejection of a given null hypothesis.
Then one makes either of two decisions depending upon which of the regions
contains the sample point. On the other hand, the theory of estimation is a
generalization of the notion of statistical decision in which the number of alternatives
is not restricted to be finite.

As in any phase of statistical inference, our primary aim is to define broad
principles upon which "good" or "best" procedures for making statistical decisions
may be based. The general problem of statistical decisions has been formulated
by A. Wald, who has also proposed a principle on which the solution can
be based. We are interested, however, in several of the simpler but important
particular problems in which quite serious calculation difficulties are encountered
in actually finding Wald's solution. Hence, we will propose in its stead another
principle which quite closely resembles Wald's for selecting a solution of the
problem of statistical decision.

It may be pointed out immediately that, from a purely logical point of view, 
the substitute principle we shall offer will probably be considered to be less 
acceptable than its predecessor. We will find, however, by considering its 
application to some of the well known problems of testing hypotheses, that the 
principle is at least reasonable in leading to certain well accepted results. 

> Research under a grant-in-aid of the Carnegie Corporation of New York. 

2. Principle determining the “best” procedure. We will first discuss briefly
Wald's principle; the criterion that we will employ will then be defined by
pointing out the differences. A much more general formulation is possible [1], [2],
but we will discuss the principle as it will be directly applied to the problems
of statistical decisions when the number of hypotheses is finite.

Consider the variates $x_1, x_2, \ldots, x_p$ whose probability density function
$f(x_1, x_2, \ldots, x_p \mid \theta_1, \theta_2, \ldots, \theta_k)$ is known except for the unknown values of
the parameters $\theta_1, \theta_2, \ldots, \theta_k$. We denote by $\theta$ a point in $k$-dimensional
space whose coordinates are $(\theta_1, \theta_2, \ldots, \theta_k)$ and shall speak of this parameter
space as $\Omega$. Suppose that $\omega$ is any subset of $\Omega$ and that $S$ represents a system
of finitely many such sets which are mutually disjunct and which cover $\Omega$.
Each element, $\omega_0$, of $S$ corresponds to a hypothesis $H_{\omega_0}$, which is the hypothesis
that $\theta$ is a point of $\omega_0$, and the system of all such hypotheses corresponding to $S$
we denote by $H_S$.

A sample of $N$ observations on $x_1, x_2, \ldots, x_p$ is drawn and the sample may be
considered as a point, $E$, in the $pN$ dimensional sample space; denote the sample
space by $M$. We want to decide on the basis of the point $E$ which of the
hypotheses of $H_S$ should be accepted. That is, we seek a procedure by which the
sample space may be divided into a system of mutually exclusive regions $M_\omega$
which are the same in number as the number of elements of $S$, and by which a
correspondence is set up so that the falling of the sample point into a particular
$M_{\omega_0}$ shall cause us to accept a particular hypothesis $H_{\omega_0}$ as the true one. If
the totality of regions be denoted $M_S$, it is necessary to give a principle by
which we may prefer a particular system $M_S$ over any other system $M_S'$.

Wald introduces the notion of a weight function of errors, a function of the
parameters and of the decision made, which might well be defined as the loss
incurred if $\theta$ be the true parameter point and the sample point falls in $M_\omega$, which
causes us to accept the hypothesis $H_\omega$. Denote the weight function by $W(\theta, \omega_E)$
where $\omega_E$ stands for that hypothesis which we choose if $E$ is the sample point;
then we require that $W(\theta, \omega_E)$ be non-negative, and if $\theta$ lies in $\omega_E$, $W(\theta, \omega_E) = 0$,
for then the correct decision has been made and there is no loss.

Perhaps the notion of a weight function can be most clearly understood, and 
its importance appreciated, if we consider the place of statistics in the business 
world, where possible losses are often computable in terms of money. The 
weight function may be taken to be equal to this loss. Suppose a manufacturing 
plant has a process which manufactures a product whose efficiency is a measurable
quantity that we will denote by $x$. Suppose $x$ is a random variable whose
distribution depends only upon its mean value $\theta$, and the company contemplates
renewing its machinery if the mean value of the efficiency falls short significantly
from a particular value $\theta_0$. Then on the basis of a sample of $N$ observations on
$x$, one of two decisions must be reached: the rejection of the hypothesis $\theta \ge \theta_0$
(the decision to renew the machinery), or the non-rejection of $\theta \ge \theta_0$ (the decision
not to renew it). Suppose the region $M_{\bar\omega}$ is the region of the sample space such
that if $E$ falls into $M_{\bar\omega}$, we reject $\theta \ge \theta_0$, and $M_\omega$ is the complementary region.
Then we may say that the weight function can be defined by

$$W(\theta, \omega) = 0 \quad \text{for } \theta \ge \theta_0$$

$$W(\theta, \omega) = g(\theta) \quad \text{for } \theta < \theta_0$$

$$W(\theta, \bar\omega) = 0 \quad \text{for } \theta < \theta_0$$

$$W(\theta, \bar\omega) = h(\theta) \quad \text{for } \theta \ge \theta_0$$

where $h(\theta)$ is the company’s monetary loss in needlessly changing its machinery
and $g(\theta)$ is a function which expresses the company’s loss in not changing its
process even though the true value of the parameter is $\theta < \theta_0$. The function
$g(\theta)$ may be of almost any form, but it is only reasonable that it should be a
monotonic non-decreasing function of $|\theta_0 - \theta|$, since the loss should, it seems,
increase as the true value of $\theta$ is farther from $\theta_0$.

Wald then defines the risk as the expected value of the loss; since $\theta$ is an
unknown, the risk will be a function of $\theta$, and it will also be a function of the system
$M_S$:

$$r(\theta, M_S) = \int_M W(\theta, \omega_E)\, f(E \mid \theta)\, dE.$$

According to Wald, the “best” system of regions, $M_S$, is that system for which
the maximum of the risk function with respect to the parameter $\theta$ is a minimum
with respect to all possible systems, $M_S$, of regions. Several important
properties are enjoyed by the system of regions defined in this way, though other
reasonable definitions are possible. Perhaps the criterion of minimizing an
average with respect to $\theta$ of $r(\theta, M_S)$ rather than the maximum may be
considered more plausible, but such definitions would raise the question of which
average should be used, and the result obtained by using any particular average
would not be invariant with respect to transformations of the parameter space.
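Wald's criterion can be illustrated numerically on the machinery example above. The sketch below rests on assumptions that are not in the original: $x$ is taken to be normal with unit variance, the decision rule is "renew when the sample mean falls below a cutoff c", and the loss functions `g` and `h`, the sample size, and the grids are all hypothetical. It evaluates the risk on a grid of $\theta$ and selects the cutoff whose maximum risk is smallest.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def risk(theta, cutoff, theta0, n, g, h):
    """Expected loss at true mean theta for the rule 'renew when xbar < cutoff'.

    Assumption: x ~ Normal(theta, 1), so the mean of n observations is
    Normal(theta, 1/n).
    """
    p_renew = normal_cdf(math.sqrt(n) * (cutoff - theta))
    if theta < theta0:
        return g(theta) * (1.0 - p_renew)  # loss from failing to renew
    return h(theta) * p_renew              # loss from renewing needlessly


def max_risk(cutoff, theta0, n, g, h, thetas):
    """Maximum of the risk over a grid of parameter values."""
    return max(risk(t, cutoff, theta0, n, g, h) for t in thetas)


def minimax_cutoff(theta0, n, g, h, thetas, cutoffs):
    """Cutoff whose maximum risk is smallest (Wald's criterion on a grid)."""
    return min(cutoffs, key=lambda c: max_risk(c, theta0, n, g, h, thetas))


# Hypothetical inputs: theta0 = 0, nine observations, g grows with |theta0 - theta|.
THETA0, N_OBS = 0.0, 9
g = lambda theta: abs(THETA0 - theta)
h = lambda theta: 1.0
thetas = [i / 100 for i in range(-300, 301)]
cutoffs = [i / 100 for i in range(-100, 101)]
best = minimax_cutoff(THETA0, N_OBS, g, h, thetas, cutoffs)
```

With these particular losses a needless renewal costs more than a small shortfall in efficiency, so the minimax cutoff falls below $\theta_0$; a different choice of `g` and `h` would move it.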

Using the notations as introduced above, and introducing the notation $W(\theta, \omega_i)$
to be the weight function if the $i$th hypothesis is chosen, the principle which
we will use to solve some of the problems of statistical decisions can be given as
follows: In place of the risk function, we consider the $s$ functions

$$R_i(\theta, E) = W(\theta, \omega_i)\, f(E \mid \theta) \qquad (i = 1, 2, \ldots, s)$$

where $f(E \mid \theta)$ is a notation for the probability density, and $s$ is the number of
given hypotheses. If we denote by $R_i(E)$ the least upper bound of $R_i(\theta, E)$
with respect to $\theta$, then we choose the system of “best” regions of acceptance by
including each sample point $E$ in a region $M_i$ determined such that for all $E_0$ in
$M_i$, $R_i(E_0) \le R_j(E_0)$ for all $j \ne i$.
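The rule just stated can be sketched in a few lines. The example below is an illustration only: the three hypotheses (intervals for a normal mean), the weight function (distance from the true mean to the accepted interval), the single-observation density, and the parameter grid are all assumptions made for the sketch and do not come from the original.

```python
import math


def choose_hypothesis(sample, n_hypotheses, thetas, weight, density):
    """Index i minimizing R_i(E) = max over theta of W(theta, i) * f(E | theta).

    The least upper bound is approximated by a maximum over the grid `thetas`.
    Returns the chosen index together with all the values R_i(E).
    """
    scores = [
        max(weight(theta, i) * density(sample, theta) for theta in thetas)
        for i in range(n_hypotheses)
    ]
    best = min(range(n_hypotheses), key=scores.__getitem__)
    return best, scores


# Hypothetical setup: one observation x ~ Normal(theta, 1) and three hypotheses
# 0: theta < -1,   1: -1 <= theta <= 1,   2: theta > 1.
INTERVALS = [(-math.inf, -1.0), (-1.0, 1.0), (1.0, math.inf)]


def weight(theta, i):
    """Loss = distance from theta to the accepted interval (zero inside it)."""
    lo, hi = INTERVALS[i]
    return max(0.0, lo - theta, theta - hi)


def density(x, theta):
    """Normal density with mean theta and unit variance."""
    return math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2.0 * math.pi)


THETAS = [i / 100 for i in range(-600, 601)]
```

For instance, `choose_hypothesis(2.5, 3, THETAS, weight, density)` selects the third hypothesis, since accepting either of the other two leaves a large loss at parameter values near the observation.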

It is interesting to note that a rather general case exists in which the principle
is exactly equivalent with the test of a hypothesis based upon the likelihood
ratio principle. Consider the distribution function $f(x_1, x_2, \ldots, x_p \mid \theta_1, \theta_2,
\ldots, \theta_h)$ which is a bounded function of the $x$’s and $\theta$’s. Suppose we are
interested in the test of the hypothesis $(\theta_1, \theta_2, \ldots, \theta_h) \in \omega$, where $\omega$ is a closed
set of points of the parameter space which does not contain any open subset of
the parameter space. Furthermore assume that for each set of the $x$’s, the
distribution function is continuous in $\theta_1, \ldots, \theta_h$ on an open subset of $\Omega$ containing $\omega$.

We will show that the principle will lead to the test based on the likelihood 
ratio if the following is the weight function: 

I. If $\omega$ is accepted, the loss is zero if the true parameter point is in $\omega$, and the
loss is a constant $c_1$ if the true parameter point is not in $\omega$.

II. If $\omega$ is rejected (i.e. $\bar\omega$ is chosen), the loss is zero if the true parameter
point is in $\bar\omega$ and is a constant $c_2$ if the true parameter point is in $\omega$.

Consider then the region of the sample space for which $\omega$ is rejected according
to the principle. This region is that for which

$$\operatorname*{l.u.b.}_{\theta \in \omega}\, [c_2 f(x \mid \theta)] < \operatorname*{l.u.b.}_{\theta \in \bar\omega}\, [c_1 f(x \mid \theta)]$$

where we have set $f(x \mid \theta) = f(x_1, x_2, \ldots, x_p \mid \theta_1, \ldots, \theta_h)$ and where l.u.b.
denotes the least upper bound with respect to the indicated values of $\theta$. But the
left-hand member of this inequality is equal to

$$c_2\, [\operatorname*{l.u.b.}_{\theta \in \omega} f(x \mid \theta)]$$

and because of the restriction on $\omega$ and the continuity of $f$, we can see that the
l.u.b. of $f(x \mid \theta)$ with respect to all $\theta$ in $\bar\omega$ must coincide with the l.u.b. of the
function with respect to all $\theta$ in $\Omega$, which is the total parameter space. Thus
we have that the hypothesis $\omega$ is rejected when

$$c_2\, [\operatorname*{l.u.b.}_{\theta \in \omega} f(x \mid \theta)] < c_1\, [\operatorname*{l.u.b.}_{\theta \in \Omega} f(x \mid \theta)]$$

or when

$$\frac{\operatorname*{l.u.b.}_{\theta \in \omega} f(x \mid \theta)}{\operatorname*{l.u.b.}_{\theta \in \Omega} f(x \mid \theta)} < \frac{c_1}{c_2}.$$

The left-hand member of this inequality is the likelihood ratio statistic
introduced by Neyman and Pearson [3]; hence our test is exactly equivalent with the
likelihood ratio test where the size of the critical region is determined by $c_1$
and $c_2$.
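The equivalence can be checked numerically in a simple special case. The sketch below assumes, purely for illustration, a normal sample with unit variance, the hypothesis $\omega$ = {mean equal to zero}, and hypothetical constants $c_1 = 1$, $c_2 = 5$; the least upper bound over the complement of $\omega$ is approximated on a grid. None of these choices come from the original.

```python
import math


def likelihood(xs, mu):
    """Normal likelihood with unit variance; the constant factor is omitted."""
    return math.exp(-0.5 * sum((x - mu) ** 2 for x in xs))


def reject_by_principle(xs, c1, c2, grid):
    """Reject omega = {mu = 0} when c2 * sup_omega f < c1 * sup_complement f."""
    risk_if_rejected = c2 * likelihood(xs, 0.0)
    risk_if_accepted = c1 * max(likelihood(xs, mu) for mu in grid if mu != 0.0)
    return risk_if_rejected < risk_if_accepted


def reject_by_likelihood_ratio(xs, c1, c2):
    """Reject when the likelihood ratio falls below c1 / c2."""
    xbar = sum(xs) / len(xs)  # maximizes the likelihood over the whole line
    ratio = likelihood(xs, 0.0) / likelihood(xs, xbar)
    return ratio < c1 / c2


GRID = [i / 1000 for i in range(-5000, 5001)]
NEAR_ZERO = [0.1, -0.2, 0.05, 0.0]
FAR_FROM_ZERO = [1.5, 2.0, 1.0, 2.5]
```

Both functions return the same decision for each of the two samples; agreement very close to the boundary of the critical region would depend on how fine the grid is.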

We pose the following quite hypothetical example to show circumstances 
under which the principle proposed is reasonable. The principle does not 
exactly apply as it was stated in terms of probability densities and the example 
involves discrete probabilities, but the logic seems somewhat applicable.
Suppose a game is played which consists of the player's guessing the number of white
balls in an urn known to contain 10 balls, each of which is either white or black,
on the basis of a sample of four drawings with replacement from the urn. Let
us assume that there are eleven mutually exclusive hypotheses (as to the number
of white balls in the urn) to choose among, and the player must make a choice
of one of them after observing the drawing, which can give 16 different results.
Assume that the one who plays the game pays a banker a varying sum of money
if he makes a wrong decision and that the banker has the privilege of choosing
the population (i.e. the number of white and black balls originally in the urn).
Now on the basis of the assumption that the banker knows the player’s decision
function and will attempt to fix the population so as to make the player’s
expected loss a maximum, it is clear that Wald’s principle, which minimizes the
maximum loss, leads to the best way to play the game.

Now suppose that instead of one player making the choice among the
decisions, we have 16 players participating in the game and the first player is to
make the choice if, and only if, the drawing is WWWW, the second player if the
drawing is WWWB, and so on, where W stands for the drawing of a white ball
and B for the drawing of a black one. In this case, if player $x$ assumes that the
banker will try to choose the population most unfavorable to him, then his
decision function based on the new principle is the best method of play.
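The sixteen-player version of the game can be worked through by direct enumeration. The sketch below assumes a loss of one unit for any wrong guess, which is a simplification introduced here (the text speaks of a varying sum); the function names are likewise hypothetical. For each observed sequence, the player who must decide computes, for every possible guess, the largest probability of that sequence among the populations for which the guess would be wrong, and picks the guess for which this quantity is smallest.

```python
from fractions import Fraction


def outcome_probability(outcome, whites, total=10):
    """Probability of an ordered W/B sequence drawn with replacement."""
    p = Fraction(whites, total)
    w = outcome.count("W")
    return p ** w * (1 - p) ** (len(outcome) - w)


def decide(outcome, total=10):
    """Guess minimizing the worst-case value of loss times probability.

    Assumption: the loss is 1 for every wrong guess and 0 for a correct one.
    """
    def worst_case(guess):
        return max(
            outcome_probability(outcome, whites, total)
            for whites in range(total + 1)
            if whites != guess
        )

    return min(range(total + 1), key=worst_case)


# Decisions for representative drawings.
decisions = {seq: decide(seq) for seq in ("WWWW", "WWWB", "WWBB", "WBBB", "BBBB")}
```

Under this constant loss the rule reduces to guessing the population that makes the observed sequence most probable, so `decisions` comes out as 10, 7, 5, 3 and 0 white balls respectively.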

Although the example indicates that in the usual case which would come up 
in practice, Wald’s principle would lead to the better procedure, since the 
statistician is usually faced with the necessity of giving a decision no matter
what the sample point is, the new principle is useful since one may hope that in
many practical cases the two principles will not lead to widely varying results,
especially if the sample is large.

3. Application of the criterion to the case of testing the mean of a normal
distribution. Now we will show that the criterion will lead to the widely used
test of “Student’s hypothesis.” Suppose $x$ is known to be distributed normally
with unknown mean $\mu$ and unknown variance $\sigma^2$. On the basis of a sample of $N$
independent observations $x_1, x_2, \ldots, x_N$, “Student’s $t$” is used to test the
hypothesis $\mu = 0$. If $\bar x$ is the arithmetic mean of the $N$ observations and $s^2$ the
usual sample estimate of the variance, then with $t = \sqrt{N}\,\bar x/s$, the hypothesis
is to be rejected if $|t| \ge t_0$ where $t_0$ is a critical value at some chosen level of
significance $\alpha$ obtained from the distribution of $t$ under the null hypothesis. We
will use the notation $\omega_1$ for the set of points $\mu \ne 0$ and $\omega_2$ for the set of points
$\mu = 0$.

We will consider the problem in reference to the particular weight function
defined as follows:

$$W(\mu, \sigma; \omega_2) = (\mu/\sigma)^k \quad \text{for } \mu \ne 0$$

$$W(0, \sigma; \omega_1) = W$$

$$W(\mu, \sigma; \omega_1) = 0 \quad \text{for } \mu \ne 0$$

$$W(0, \sigma; \omega_2) = 0$$

where as a matter of convenience, we will take $k$ an even positive integer in order
to avoid the introduction of the absolute value of $\mu/\sigma$ which is necessary if $k$
is an odd integer. We also take $k \le N$.

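The density function of the sample of $N$ observations is

$$\frac{C}{\sigma^{N}}\, e^{-(1/2\sigma^2)\sum (x_\alpha - \mu)^2}$$

where $C$ is a constant. Then the two functions $R_i(\theta, E)$ are

$$R_1(\theta, E) = \frac{CW}{\sigma^{N}}\, e^{-(1/2\sigma^2)\sum x_\alpha^2} \quad \text{if } \mu = 0, \qquad R_1(\theta, E) = 0 \quad \text{if } \mu \ne 0,$$

$$R_2(\theta, E) = \frac{C\mu^k}{\sigma^{N+k}}\, e^{-(1/2\sigma^2)\sum (x_\alpha - \mu)^2} \quad \text{if } \mu \ne 0, \qquad R_2(\theta, E) = 0 \quad \text{if } \mu = 0.$$

To maximize $R_1(\theta, E)$, we set

$$\frac{\partial R_1}{\partial \sigma} = \left[-N + \frac{\sum x_\alpha^2}{\sigma^2}\right] \frac{CW}{\sigma^{N+1}}\, e^{-(1/2\sigma^2)\sum x_\alpha^2} = 0$$

which gives

$$\sigma^2 = \frac{\sum x_\alpha^2}{N};$$

hence

$$R_1(E) = CW \left(\frac{N}{\sum x_\alpha^2}\right)^{N/2} e^{-N/2}.$$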

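To maximize $R_2(\theta, E)$, we set

$$\frac{\partial R_2}{\partial \mu} = \left[\frac{k}{\mu} + \frac{\sum (x_\alpha - \mu)}{\sigma^2}\right] \frac{C\mu^k}{\sigma^{N+k}}\, e^{-(1/2\sigma^2)\sum (x_\alpha - \mu)^2} = 0$$

and

$$\frac{\partial R_2}{\partial \sigma} = \left[-(N+k) + \frac{\sum (x_\alpha - \mu)^2}{\sigma^2}\right] \frac{C\mu^k}{\sigma^{N+k+1}}\, e^{-(1/2\sigma^2)\sum (x_\alpha - \mu)^2} = 0$$

which give the two relations

$$\sigma^2 = -\frac{\mu}{k} \sum (x_\alpha - \mu)$$

and

$$\sigma^2 = \frac{\sum (x_\alpha - \mu)^2}{N + k}.$$

Then

$$-\mu (N + k) \sum (x_\alpha - \mu) = k \sum (x_\alpha - \mu)^2$$

or

$$\mu^2 - \mu \bar x (1 - k/N) - (k/N^2) \sum x_\alpha^2 = 0$$

which gives the maximizing value of $\mu$:

$$\mu^* = \frac{\bar x (1 - k/N) \pm \sqrt{\bar x^2 (1 - k/N)^2 + (4k/N^2) \sum x_\alpha^2}}{2}$$

and it can easily be shown that the maximum is reached for the value of $\mu^*$
using the $+$ sign when $\bar x$ is positive and the $-$ sign when $\bar x$ is negative. We will
carry through the case $\bar x > 0$ only, as the case $\bar x < 0$ follows in a similar manner.
We have

$$R_2(E) = C \mu^{*k} \left[\frac{N + k}{\sum (x_\alpha - \mu^*)^2}\right]^{(N+k)/2} e^{-(N+k)/2}.$$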

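To find the region of the sample space for which we should accept the
hypothesis $\mu \ne 0$ (i.e. the critical region for rejection of the hypothesis $\mu = 0$), we
seek those points $E$ for which $R_1(E) \le R_2(E)$, i.e. those for which

$$CW \left(\frac{N}{\sum x_\alpha^2}\right)^{N/2} e^{-N/2} \le C \mu^{*k} \left[\frac{N + k}{\sum (x_\alpha - \mu^*)^2}\right]^{(N+k)/2} e^{-(N+k)/2}$$

or for which

$$\frac{\mu^{*k} \left(\sum x_\alpha^2\right)^{N/2}}{\left[\sum (x_\alpha - \mu^*)^2\right]^{(N+k)/2}} \ge c$$

where $c$ is a positive constant. Since both sides of the inequality are positive,
this inequality is equivalent to

$$\frac{\mu^{*2k} \left(\sum x_\alpha^2\right)^{N}}{\left[\sum (x_\alpha - \mu^*)^2\right]^{N+k}} \ge c_1 \tag{1}$$

where $c_1$ is another positive constant.
Now we consider the statistic

$$T^2 = \frac{t^2}{N - 1} = \frac{N \bar x^2}{\sum x_\alpha^2 - N \bar x^2} = \frac{N}{\sum x_\alpha^2/\bar x^2 - N}$$

from which we have

$$\sum x_\alpha^2/\bar x^2 = (N/T^2) + N.$$

Also note that

$$2(\mu^*/\bar x) = (1 - k/N) + \sqrt{(1 - k/N)^2 + (4k/N^2)\left(\sum x_\alpha^2/\bar x^2\right)}$$

(and this is true whether $\bar x$ is positive or negative). Since the two relations
for $\sigma^2$ give $\sum (x_\alpha - \mu^*)^2 = (N + k) N \mu^* (\mu^* - \bar x)/k$, we can write the
critical region (1) as

$$\frac{(\mu^*/\bar x)^{N-k} (\mu^*/\bar x - 1)^{N+k}}{\left(\sum x_\alpha^2/\bar x^2\right)^{N}} \le \text{constant}$$

or

$$\frac{\left[1 - k/N + \sqrt{(1 - k/N)^2 + (4k/N)(1 + 1/T^2)}\right]^{N-k} \left[-1 - k/N + \sqrt{(1 - k/N)^2 + (4k/N)(1 + 1/T^2)}\right]^{N+k}}{(1 + 1/T^2)^{N}} \le c_2$$

where $c_2$ is another positive constant. We denote the left side of this inequality
by $\varphi(T^2)$ and it can be shown that $\varphi(T^2)$ is a monotone decreasing function of $T^2$.
Thus since the critical region is defined by the relation $\varphi(T^2) \le$ constant and
the critical region using “Student’s $t$” is $T^2 \ge$ constant, these procedures are
exactly equivalent.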
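The monotonicity of $\varphi(T^2)$ is stated without proof, so a numerical check may be helpful. The sketch below takes $\varphi$ in the form $A^{N-k} B^{N+k} / (1 + 1/T^2)^N$ with $A = 1 - k/N + S$, $B = -1 - k/N + S$ and $S = \sqrt{(1 - k/N)^2 + (4k/N)(1 + 1/T^2)}$; the denominator is the factor obtained from $(\sum x_\alpha^2/\bar x^2)^N$ and is an assumption of this reconstruction. The values $N = 10$, $k = 2$ and the grid of $T^2$ are hypothetical.

```python
import math


def phi(t2, n, k):
    """Left side of the critical-region inequality as a function of T^2."""
    u = 1.0 + 1.0 / t2
    root = math.sqrt((1.0 - k / n) ** 2 + (4.0 * k / n) * u)
    a = 1.0 - k / n + root    # equals 2 * mu*/xbar
    b = -1.0 - k / n + root   # equals 2 * (mu*/xbar - 1)
    return a ** (n - k) * b ** (n + k) / u ** n


def is_strictly_decreasing(values):
    """True when every value is smaller than the one before it."""
    return all(later < earlier for earlier, later in zip(values, values[1:]))


# T^2 on a logarithmic grid from 0.01 to 100.
T2_GRID = [10.0 ** (i / 50.0 - 2.0) for i in range(201)]
PHI_VALUES = [phi(t2, 10, 2) for t2 in T2_GRID]
```

On this grid `is_strictly_decreasing(PHI_VALUES)` holds, in agreement with the statement that a region of the form $\varphi(T^2) \le$ constant is a region of the form $T^2 \ge$ constant.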

4. A problem in statistical decisions. The question which aroused the interest
of the writer in statistical decisions is the following one of multivariate statistical
analysis. Suppose $x_1, x_2, \ldots, x_p$ are known to be normally distributed with
unknown means and unknown variances and covariances, and on the basis of
a set of $N$ independent observations, a test is to be made of the hypothesis
$E(x_1) = E(x_2) = \cdots = E(x_p) = 0$. Such a test may be carried out by using
the generalized Student ratio [4], and the hypothesis is either to be rejected or
accepted as a whole. But consider the case in which the null hypothesis is
rejected; it seems quite natural to ask for a more enlightening statement. Is it
not possible to say that on the basis of the sample, the hypothesis should be
rejected for $x_{i_1}, x_{i_2}, \ldots, x_{i_k}$ but not rejected for $x_{i_{k+1}}, \ldots, x_{i_p}$? Thus
we seek a division of the sample space into $2^p$ mutually exclusive regions, each
of which will lead us to reject the hypothesis of zero expected values for a
particular set of the $x_i$’s and to accept it for the remaining set.

We will consider a solution of the problem in the case that the covariance
matrix of the joint normal distribution is known, and will motivate that solution
by considering first the case of two variables.

Suppose that $X$ and $Y$ are normally and independently distributed with
unknown means, $\alpha$ and $\beta$, and with unit variances. The joint probability density
function is then of the form

$$f(X, Y) = \frac{1}{2\pi}\, e^{-\frac{1}{2}[(X - \alpha)^2 + (Y - \beta)^2]}.$$

The set of hypotheses is given as follows:

$H_1$ is the hypothesis that $\alpha = 0$ and $\beta = 0$

$H_2$ is the hypothesis that $\alpha \ne 0$ and $\beta = 0$

$H_3$ is the hypothesis that $\alpha = 0$ and $\beta \ne 0$

$H_4$ is the hypothesis that $\alpha \ne 0$ and $\beta \ne 0$.

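We have a sample of $N$ independent pairs of observations $(X_\sigma, Y_\sigma)$ where $\sigma =
1, 2, \ldots, N$; then the density function in the $2N$ dimensional sample space is

$$\frac{1}{(2\pi)^{N}}\, e^{-\frac{1}{2}\sum_\sigma [(X_\sigma - \alpha)^2 + (Y_\sigma - \beta)^2]}.$$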

We seek the set of regions $M_1, M_2, M_3, M_4$ in the sample space which are
chosen such that if the sample point $E$ falls in $M_i$, we accept the hypothesis $H_i$.
We take the following as the values of the losses if the wrong decision is reached:

I. If $H_1$ is accepted,

i) for any parameter point $(\alpha, \beta)$, the loss is a continuous function of
$(\alpha^2 + \beta^2)$, say $W(\alpha^2 + \beta^2)$, which is zero for $\alpha = \beta = 0$, is differentiable,
strictly monotonically increasing, and possesses a finite maximum
when multiplied by the normal density function.

II. If $H_2$ is accepted,

i) for any parameter point $(\alpha, \beta)$ except $(0, 0)$, the loss is $W(\beta^2)$ where
$W$ is the same function as above,

ii) the loss is $W_1$ if the true parameter point is $(0, 0)$.

III. If $H_3$ is accepted,

i) for any parameter point $(\alpha, \beta)$ except $(0, 0)$, the loss is $W(\alpha^2)$ where
$W$ is the same function as above,

ii) the loss is $W_1$ if the true parameter point is $(0, 0)$.

IV. If $H_4$ is accepted,

i) the loss is $W_2$ if the true parameter point is either $(\alpha, 0)$ for $\alpha \ne 0$
or $(0, \beta)$ for $\beta \ne 0$,

ii) the loss is $W_3$ if the true parameter point is $(0, 0)$,

where $W_1$, $W_2$, and $W_3$ are constants subject to some slight restrictions which
will be pointed out later.

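The functions $R_i(\theta, E)$ are then the following:

$$R_1(\theta, E) = \begin{cases} W(\alpha^2 + \beta^2)\, G(\alpha, \beta) & \text{for } \alpha^2 + \beta^2 \ne 0 \\ 0 & \text{for } \alpha = \beta = 0 \end{cases}$$

$$R_2(\theta, E) = \begin{cases} W(\beta^2)\, G(\alpha, \beta) & \text{for } \beta \ne 0 \\ W_1\, G(0, 0) & \text{for } \alpha = \beta = 0 \\ 0 & \text{for } \alpha \ne 0,\ \beta = 0 \end{cases}$$

$$R_3(\theta, E) = \begin{cases} W(\alpha^2)\, G(\alpha, \beta) & \text{for } \alpha \ne 0 \\ W_1\, G(0, 0) & \text{for } \alpha = \beta = 0 \\ 0 & \text{for } \alpha = 0,\ \beta \ne 0 \end{cases}$$

$$R_4(\theta, E) = \begin{cases} W_2\, G(\alpha, 0) & \text{for } \alpha \ne 0,\ \beta = 0 \\ W_2\, G(0, \beta) & \text{for } \alpha = 0,\ \beta \ne 0 \\ W_3\, G(0, 0) & \text{for } \alpha = \beta = 0 \\ 0 & \text{for } \alpha\beta \ne 0 \end{cases}$$

where $G(\alpha, \beta)$ is the normal distribution function

$$G(\alpha, \beta) = \frac{N}{2\pi}\, e^{-(N/2)[(\bar x - \alpha)^2 + (\bar y - \beta)^2]},$$

$\bar x$ and $\bar y$ being the sample means. It should be pointed out that the use of the
distribution of the sample means instead of the joint distribution of the
observations is justified since the sample means are sufficient statistics for the parameters
$\alpha$ and $\beta$.

We will use the notation $R_2(E)$ to denote the maximum of $R_2(\theta, E)$ with respect
to $\alpha$ and $\beta$, and it can easily be seen to be the maximum of two expressions which
we will denote by II(1) and II(2), where II(1) is the maximum of $W(\beta^2) G(\alpha, \beta)$
and II(2) is the maximum of $W_1 G(0, 0)$. Similarly, $R_3(E)$ is the maximum of
III(1) and III(2), and $R_4(E)$ is the maximum of IV(1), IV(2), and IV(3), where
these are the maxima of the two expressions involved in $R_3(\theta, E)$, and the three
expressions in $R_4(\theta, E)$, respectively.

We will first show that the function $R_1(E)$ is a monotonic increasing function
of $(\bar x^2 + \bar y^2)$. We know that the maximum of $R_1(\theta, E)$ is reached for values of
$\alpha$ and $\beta$ for which the partial derivatives of $R_1(\theta, E)$ with respect to $\alpha$ and $\beta$
are zero, i.e., for which

$$[N(\bar x - \alpha)\, W(\alpha^2 + \beta^2) + 2\alpha\, W'(\alpha^2 + \beta^2)]\, G(\alpha, \beta) = 0$$

and

$$[N(\bar y - \beta)\, W(\alpha^2 + \beta^2) + 2\beta\, W'(\alpha^2 + \beta^2)]\, G(\alpha, \beta) = 0$$

where $W'(\alpha^2 + \beta^2)$ is the derivative of $W(\alpha^2 + \beta^2)$ with respect to $(\alpha^2 + \beta^2)$.
Since $G(\alpha, \beta) \ne 0$, and $W'(\alpha^2 + \beta^2) \ne 0$, these relations imply

$$\frac{\bar x - \alpha}{\alpha} = \frac{\bar y - \beta}{\beta},$$

that is, $\beta \bar x = \alpha \bar y$. Thus the maximum of the function $R_1(\theta, E)$ occurs for values of
$\alpha$ and $\beta$ which satisfy the relation $\alpha = (\bar x/\bar y)\beta$.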

Consider any two straight lines $\alpha = (\bar x'/\bar y')\beta$ and $\alpha = (\bar x''/\bar y'')\beta$ and the
values of the function $R_1(\theta, E)$ along these two lines. Obviously, the values of
the first factor $W(\alpha^2 + \beta^2)$ are equal for points along the lines equidistant from
the origin. Also, if the values of $\bar x'$, $\bar y'$, $\bar x''$, and $\bar y''$ are such that $\bar x'^2 + \bar y'^2 =
\bar x''^2 + \bar y''^2$, the values of the function $G(\alpha, \beta)$ along both lines are equal for points
equidistant from the origin, and it follows that $R_1(\bar x', \bar y') = R_1(\bar x'', \bar y'')$. Thus
we have that $R_1(E)$ is a function of $(\bar x^2 + \bar y^2)$.

Note that if the value of $\bar x''^2 + \bar y''^2$ is greater than the value of $\bar x'^2 + \bar y'^2$, the
curve representing the function $G(\alpha, \beta)$ along $\alpha = (\bar x''/\bar y'')\beta$ is the same as that
along the line $\alpha = (\bar x'/\bar y')\beta$, but it is shifted further from the origin. The values
of $W(\alpha^2 + \beta^2)$ are independent of $\bar x$ and $\bar y$ and the function is monotonic in
$\alpha^2 + \beta^2$. Thus, the value of $G(\alpha, \beta)$ for which $R_1(\theta, E)$ is a maximum on $\alpha =
(\bar x''/\bar y'')\beta$ multiplies a larger value of $W(\alpha^2 + \beta^2)$ than on $\alpha = (\bar x'/\bar y')\beta$, so the
maximum when $\bar x''^2 + \bar y''^2$ exceeds $\bar x'^2 + \bar y'^2$ is the greater. But this proves that
$R_1(E)$ is monotonically increasing in $(\bar x^2 + \bar y^2)$.

In a similar manner, we now proceed to show that II(1) is a monotonically
increasing function of $\bar y^2$. We know that a necessary condition for a maximum
of II(1) is that

$$\frac{\partial\, \mathrm{II}(1)}{\partial \alpha} = \frac{\partial\, \mathrm{II}(1)}{\partial \beta} = 0.$$

The first of these two relations is

$$W(\beta^2)\, N(\bar x - \alpha)\, G(\alpha, \beta) = 0$$

which has the solutions $W(\beta^2) = 0$ and $\alpha = \bar x$. But $W(\beta^2) = 0$ only for $\beta = 0$
and this value is a minimum of II(1), hence we have that the maximum is reached
for $\alpha = \bar x$, so

$$\mathrm{II}(1) = \max_{\beta}\; W(\beta^2)\, \frac{N}{2\pi}\, e^{-(N/2)(\bar y - \beta)^2}.$$

But along any two lines $\alpha$ = constant in the $(\alpha, \beta)$-plane, the function $W(\beta^2)$
has identical monotonically increasing values in $\beta^2$ and the normal density
function is identical along two such lines for a fixed value of $\bar y$. An increase in
the value of $\bar y^2$ displaces the normal function from the origin but does not affect
its shape, hence the value of the normal density function at which II(1) takes on
its maximum is multiplied by a greater value of $W(\beta^2)$ when $\bar y^2$ is increased, so
II(1) is monotonically increasing in $\bar y^2$. In exactly the same manner, we find
that III(1) is a monotonically increasing function of $\bar x^2$.
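The two properties of $R_1(E)$ argued above (the maximum lies on the line $\beta \bar x = \alpha \bar y$, and its value increases with $\bar x^2 + \bar y^2$) can be checked for one concrete weight function. The sketch below assumes $W(z) = z$, which is only one admissible choice and is not specified in the original. On the line through the origin and $(\bar x, \bar y)$, at distance $t$ from the origin, the product reduces to $t^2 \frac{N}{2\pi} e^{-(N/2)(r - t)^2}$ with $r^2 = \bar x^2 + \bar y^2$, whose maximum is at $t = \frac{1}{2}[r + \sqrt{r^2 + 8/N}]$.

```python
import math


def r1_density(alpha, beta, xbar, ybar, n):
    """W(alpha^2 + beta^2) * G(alpha, beta), with the assumed weight W(z) = z."""
    w = alpha ** 2 + beta ** 2
    g = n / (2.0 * math.pi) * math.exp(
        -0.5 * n * ((xbar - alpha) ** 2 + (ybar - beta) ** 2)
    )
    return w * g


def r1_closed_form(r, n):
    """Maximum of r1_density when xbar^2 + ybar^2 = r^2 (for W(z) = z)."""
    t = 0.5 * (r + math.sqrt(r * r + 8.0 / n))
    return t * t * n / (2.0 * math.pi) * math.exp(-0.5 * n * (r - t) ** 2)


def grid_argmax(xbar, ybar, n, lo=0.0, hi=2.0, steps=200):
    """Brute-force maximum of r1_density over a square grid of (alpha, beta)."""
    best = (-1.0, None, None)
    for i in range(steps + 1):
        alpha = lo + i * (hi - lo) / steps
        for j in range(steps + 1):
            beta = lo + j * (hi - lo) / steps
            value = r1_density(alpha, beta, xbar, ybar, n)
            if value > best[0]:
                best = (value, alpha, beta)
    return best


# Hypothetical sample means (0.6, 0.8), so r = 1, with N = 5.
VALUE, ALPHA, BETA = grid_argmax(0.6, 0.8, 5)
```

Here the grid maximizer satisfies $\beta \bar x \approx \alpha \bar y$ up to the grid spacing, and `VALUE` agrees with `r1_closed_form(1.0, 5)` to within the resolution of the grid.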

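Because the remaining functions are identical with the functions considered
in the special case above, we have that

$$\mathrm{II}(2) = W_1\, \frac{N}{2\pi}\, e^{-(N/2)(\bar x^2 + \bar y^2)}$$

$$\mathrm{III}(2) = W_1\, \frac{N}{2\pi}\, e^{-(N/2)(\bar x^2 + \bar y^2)}$$

$$\mathrm{IV}(1) = W_2\, \frac{N}{2\pi}\, e^{-(N/2)\bar y^2}$$

$$\mathrm{IV}(2) = W_2\, \frac{N}{2\pi}\, e^{-(N/2)\bar x^2}$$

$$\mathrm{IV}(3) = W_3\, \frac{N}{2\pi}\, e^{-(N/2)(\bar x^2 + \bar y^2)}$$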

Now it is apparent that $R_1(E)$ is never less than II(1) since

$$W(\alpha^2 + \beta^2)\, G(\alpha, \beta) \ge W(\beta^2)\, G(\alpha, \beta)$$

(the equality holds only for $\alpha = 0$) and since a function which is never less than
a second function cannot have a maximum less than the maximum of the second
function. Also $R_1(E)$ for the same reason is never less than III(1). Thus $R_1(E)$
can be the minimum of the four functions $R_i(E)$ at most when $R_2(E)$ is defined
by II(2) and $R_3(E)$ is defined by III(2).

Since II(2) and III(2) are the same monotonic decreasing function of $(\bar x^2 +
\bar y^2)$ and since $R_1(E)$ is a monotonic increasing function of $(\bar x^2 + \bar y^2)$, there is a
value $r_0^2$ of $(\bar x^2 + \bar y^2)$ such that $R_1(E) < \mathrm{II}(2)$ when and only when $\bar x^2 + \bar y^2 < r_0^2$.
But for all values $(\bar x, \bar y)$ we have that $R_1(E) \ge \mathrm{II}(1)$ and $R_1(E) \ge \mathrm{III}(1)$, hence
for all values within the circle $\bar x^2 + \bar y^2 = r_0^2$ we have that

$$\mathrm{II}(1) \le R_1(E) < \mathrm{II}(2) \tag{2}$$

and

$$\mathrm{III}(1) \le R_1(E) < \mathrm{III}(2) \tag{3}$$

so it follows that $R_2(E)$ is defined by II(2) and $R_3(E)$ is defined by III(2) within
the circle.

We restrict the values of $W_1$, $W_2$, and $W_3$ used in the definitions of the weight
functions to be $W_1 \le W_2 \le W_3$, hence for all values of $(\bar x, \bar y)$

$$\mathrm{IV}(3) \ge \mathrm{II}(2)$$

and

$$R_4(E) \ge \mathrm{IV}(3)$$

so $R_4(E)$ is at least as great as II(2) over the whole plane; hence, in light of
relation (2), $R_4(E)$ is at least as great as $R_1(E)$ for $\bar x^2 + \bar y^2 \le r_0^2$. Therefore,
since (2) shows that $R_1(E) < R_2(E)$ within the circle; (3) shows that $R_1(E) <
R_3(E)$ within the circle; and since quite obviously the relations do not hold
outside the circle, we have that $M_1$ is the set of points

$$\bar x^2 + \bar y^2 < r_0^2.$$

To determine the region $M_2$, we must determine those points outside $M_1$ for
which $R_2(E) < R_3(E)$ and $R_2(E) < R_4(E)$. Consider first the part of the plane
outside $M_1$ for which $R_2(E)$ is defined by II(2). This is the region for which
II(2) > II(1). Consider the curve in the plane defined by II(2) = II(1), that is,

$$W_1\, \frac{N}{2\pi}\, e^{-(N/2)(\bar x^2 + \bar y^2)} = \mathrm{II}(1).$$

We take differentials and have

$$-N\, \mathrm{II}(2)\, [\bar x\, d\bar x + \bar y\, d\bar y] = 2\bar y\, [d\,\mathrm{II}(1)/d(\bar y^2)]\, d\bar y$$

but this shows that $d\bar y/d\bar x$ has the opposite sign from $\bar y/\bar x$ since $d\,\mathrm{II}(1)/d(\bar y^2)$ is
always positive. Also note that for $\bar x = 0$, the equation $R_1(E) = \mathrm{II}(2)$ is
identical with the equation II(1) = II(2), so for $\bar x = 0$, we have II(1) > II(2) when
$|\bar y| > r_0$ and II(1) < II(2) when $|\bar y| < r_0$. Furthermore, the curve II(1) =
II(2) crosses the $\bar x$ axis at a finite value of $\bar x$, since for $\bar y = 0$, II(1) is a constant
while II(2) is a decreasing function of $\bar x$.

We will refer to the various regions in the first quadrant of the $(\bar x, \bar y)$-plane
shown in Figure I as follows: $A$ is the part of the quadrant which is $M_1$; $A$, $B$,
and $C$ are the regions in which $R_2(E)$ is defined by II(2), that is, in which
II(2) > II(1); and in the same manner, $A$, $B$, $B'$, and $C'$ are the regions in which
$R_3(E)$ is defined by III(2).

Since II(2) and III(2) are identical, we see that within the regions $B$ and $B'$,
$R_2(E) = R_3(E)$ since in these regions $R_2(E)$ is defined by II(2) and $R_3(E)$ is
defined by III(2). We have previously pointed out that II(2) is never greater
than $R_4(E)$, hence it is clear that $B$ and $B'$ should belong to either $M_2$ or $M_3$,
and we will arbitrarily decide that $B$ is part of $M_2$ and $B'$ part of $M_3$.

Consider then the region $C$; here $R_2(E)$ is defined by II(2) and $R_3(E)$ by III(1),
so within $C$

$$\mathrm{II}(2) = \mathrm{III}(2) < \mathrm{III}(1) = R_3(E)$$

and again $\mathrm{II}(2) \le R_4(E)$, so the region $C$ is part of $M_2$. By the same argument
we have that $C'$ is a part of $M_3$ since within $C'$

$$\mathrm{III}(2) = \mathrm{II}(2) < \mathrm{II}(1) = R_2(E)$$

and $\mathrm{III}(2) \le R_4(E)$.

Now consider the remainder of the quadrant outside $A$, $B$, $B'$, $C$, and $C'$.
Here $R_2(E)$ is defined by II(1) and $R_3(E)$ is defined by III(1). Since II(1) is
the same monotone increasing function of $\bar y^2$ as III(1) is of $\bar x^2$, we have II(1) >
III(1) for $|\bar y| > |\bar x|$ and II(1) < III(1) for $|\bar x| > |\bar y|$. Thus we see that in
the region under discussion, $R_2(E)$ is a minimum at most in the regions $D$ and
$E$ and $R_3(E)$ a minimum at most in $D'$ and $E'$.
In order to determine then, that part of $D$ and $E$ which belongs to $M_2$, we
seek the region for which

II(1) < IV(1) when $R_4(E)$ is defined by IV(1)

II(1) < IV(2) when $R_4(E)$ is defined by IV(2)

II(1) < IV(3) when $R_4(E)$ is defined by IV(3).

But within $D$ and $E$ we have that $|\bar y| < |\bar x|$, so it follows that IV(1) > IV(2), so
$R_4(E)$ is never defined by IV(2) in $D$ or $E$. Hence we need determine the points
which satisfy the first and third of these relations. Now it is clear that the
relation II(1) < IV(1) is equivalent to the relation $|\bar y| < y_0$ for some value
$y_0$ since II(1) is monotonically increasing in $\bar y^2$ and IV(1) is monotonically
decreasing in $\bar y^2$. Let $\bar y = y_0$ be the line dividing $D$ and $E$.

We impose a restriction on $W_3$ such that $D$ is part of $M_2$ and $E$ is part of $M_4$.
This restriction is that within $E$, $\mathrm{IV}(3) \le \mathrm{IV}(1)$; note that since we are
concerned only with $|\bar y| < |\bar x|$, this imposes the greatest restriction on $W_3$ when
$\bar x = \bar y = y_0$, so we are requiring that

$$W_3\, e^{-N y_0^2} \le W_2\, e^{-(N/2) y_0^2}$$

or

$$W_3 \le W_2\, e^{(N/2) y_0^2}.$$

It is simple to see that because of symmetry with respect to both axes and the
origin, $M_2$ is defined by $\bar x^2 + \bar y^2 > r_0^2$ and $|\bar y| < |\bar x|$ and $|\bar y| < y_0$; $M_3$ by
$\bar x^2 + \bar y^2 > r_0^2$ and $|\bar x| < |\bar y|$ and $|\bar x| < x_0$; and $M_4$ by $\bar x^2 + \bar y^2 > r_0^2$ and $|\bar y| >
y_0$ and $|\bar x| > x_0$. It should be pointed out that $x_0 = y_0$.
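The four regions can be summarized as a small classification routine. This is a sketch only: the function name is hypothetical, and the numerical values of $r_0$ and of the common threshold $x_0 = y_0$ used in the example are placeholders, since in the text they are determined by the weight function and by $N$. Points lying exactly on a boundary are left unclassified.

```python
def decision_region(xbar, ybar, r0, c0):
    """Index i of the region M_i containing the sample means (xbar, ybar).

    r0 is the radius of the circle bounding M_1 and c0 the common value
    x_0 = y_0. Returns None for points exactly on a boundary.
    """
    if xbar ** 2 + ybar ** 2 < r0 ** 2:
        return 1                      # accept alpha = 0 and beta = 0
    ax, ay = abs(xbar), abs(ybar)
    if ay < ax and ay < c0:
        return 2                      # accept alpha != 0, beta = 0
    if ax < ay and ax < c0:
        return 3                      # accept alpha = 0, beta != 0
    if ax > c0 and ay > c0:
        return 4                      # accept alpha != 0, beta != 0
    return None


# Placeholder thresholds, chosen only to exercise each branch.
R0, C0 = 1.0, 1.5
examples = [
    decision_region(0.2, 0.3, R0, C0),
    decision_region(3.0, 0.1, R0, C0),
    decision_region(0.1, 3.0, R0, C0),
    decision_region(2.0, 2.5, R0, C0),
]
```

With these placeholder thresholds, `examples` evaluates to `[1, 2, 3, 4]`.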

We now consider the general case with a known covariance matrix.
Consider the joint normally distributed variates $X_1', X_2', \ldots, X_p'$ whose
covariance matrix is $\|\sigma'_{ij}\|$ $(i, j = 1, 2, \ldots, p)$, where the $\sigma'_{ij}$ are all known and
where $\|\sigma'_{ij}\|$ is positive definite. The mean values of the $X_i'$’s are $\beta_1, \beta_2, \ldots,
\beta_p$, which are unknown. It is simple to see that we can consider new variates
$X_i = X_i'/\sqrt{\sigma'_{ii}}$ whose mean values are $\alpha_i = \beta_i/\sqrt{\sigma'_{ii}}$ and whose covariance
matrix is $\|\sigma_{ij}\|$ where $\sigma_{ii} = 1$. If a sample of $N$ independent observations on
the $X_i'$’s is given, we have immediately the observations on the $X_i$’s, and we
denote the sample means of the $X_i$’s by $\bar x_1, \bar x_2, \ldots, \bar x_p$, respectively.

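There are $2^p$ hypotheses among which we wish to choose; as notation, we let

$H_0$ be $\alpha_1 = \alpha_2 = \cdots = \alpha_p = 0$

$H_1$ be $\alpha_1 \ne 0$, $\alpha_2 = \alpha_3 = \cdots = \alpha_p = 0$

$H_2$ be $\alpha_2 \ne 0$, $\alpha_1 = \alpha_3 = \cdots = \alpha_p = 0$

$H_{12}$ be $\alpha_1 \alpha_2 \ne 0$, $\alpha_3 = \alpha_4 = \cdots = \alpha_p = 0$

etc. As a further abbreviation, let $H^1$ denote any one of the $p$ hypotheses $H_1$,
$H_2, \ldots, H_p$; let $H^2$ denote any of the $\binom{p}{2}$ hypotheses $H_{12}, H_{13}, \ldots$; $H^3$ denote
any of the $\binom{p}{3}$ hypotheses $H_{123}, H_{124}, \ldots$; etc. Also let $M_{i_1 i_2 \cdots i_k}$ be the region
of the sample space for which we accept the hypothesis $H_{i_1 i_2 \cdots i_k}$, and let
$R_{i_1 i_2 \cdots i_k}(\theta, E) = W(\theta, \omega_{i_1 i_2 \cdots i_k})\, f(E \mid \theta)$ be the risk density function if the
hypothesis $H_{i_1 i_2 \cdots i_k}$ is chosen, where we have used the notation $\theta$ to represent the
parameter point $\alpha_1, \alpha_2, \ldots, \alpha_p$.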

We will also adopt the following notations: in referring to the parameter point
$(\alpha_1, \alpha_2, \ldots, \alpha_p)$, we will write $(i_1, i_2, \ldots, i_k) = 0$ to mean all points for
which $\alpha_{i_1} = \alpha_{i_2} = \cdots = \alpha_{i_k} = 0$ and $(\alpha_{j_1})(\alpha_{j_2}) \cdots (\alpha_{j_s}) \ne 0$ where $i_1, i_2,
\ldots, i_k, j_1, j_2, \ldots, j_s$ are a permutation of the integers $1, 2, \ldots, p$.
Furthermore, we will write $[j_1, j_2, \ldots, j_s] \ne 0$ to mean $(i_1, i_2, \ldots, i_k) = 0$.

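By $Q$ we denote the covariance matrix of the $X_i$’s and by $L$ its inverse; we will
denote the elements of $L$ by $\lambda_{ij}$. By $Q^{i_1 i_2 \cdots i_k}$ we denote the matrix obtained by
striking out rows $i_1, i_2, \ldots, i_k$ and columns $i_1, i_2, \ldots, i_k$ from $Q$; by $L^{i_1 i_2 \cdots i_k}$
we denote the inverse of the matrix $Q^{i_1 i_2 \cdots i_k}$, and we will write the elements of
$L^{i_1 i_2 \cdots i_k}$ as $\lambda_{ij}^{i_1 i_2 \cdots i_k}$. Thus we can write the joint distribution of the set of
sample means $\bar x_1, \bar x_2, \ldots, \bar x_p$ as

$$\left(\frac{N}{2\pi}\right)^{p/2} |L|^{1/2} \exp\left[-\frac{N}{2} \sum\sum \lambda_{ij} (\bar x_i - \alpha_i)(\bar x_j - \alpha_j)\right]. \tag{4}$$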

Concerning the definition of the weight function, we will assume the following:

I. If $H_0$ is accepted,

i) the loss is $W(\sum\sum \lambda_{ij} \alpha_i \alpha_j)$ if the true parameter point is $(\alpha_1,
\alpha_2, \ldots, \alpha_p)$, where $W$ is a continuous, strictly monotonic
increasing function whose value is zero if $(1, 2, \ldots, p) = 0$. The
function is restricted to increase slowly enough that the product of it
and the density function (4) has a finite maximum with respect to
the $\alpha_i$’s.

II. If $H^1$ is accepted,

i) consider in particular $H_a$, then for all parameter points except
$(1, 2, \ldots, p) = 0$, the value of the loss is $W(\sum\sum \lambda^{a}_{ij} \alpha_i \alpha_j)$, where $W$
is the function defined above,

ii) the loss is $W_0^1$ if the true parameter point is $(1, 2, \ldots, p) = 0$.

III. If $H^2$ is accepted,

i) consider in particular $H_{ab}$, then for all parameter points except
$(1, 2, \ldots, p) = 0$ and $[a] \ne 0$ and $[b] \ne 0$, the loss is $W(\sum\sum \lambda^{ab}_{ij} \alpha_i \alpha_j)$,
where $W$ is the function defined above,

ii) the loss is $W_1^2$ if the true parameter point is either $[a] \ne 0$ or $[b] \ne 0$,
where $W_1^2 \ge W_0^1$,

iii) the loss is $W_0^2$ if the true parameter point is $(1, 2, \ldots, p) = 0$ where
$W_0^2 \ge W_1^2$.

In general, if $H^k$ is accepted,

i) consider in particular $H_{i_1 i_2 \cdots i_k}$, then for all parameter points except
$(1, 2, \ldots, p) = 0$, $[i_1] \ne 0$, $[i_2] \ne 0$, $\ldots$, $[i_1, i_2] \ne 0$, $[i_1, i_3] \ne 0$, $\ldots$,
etc., the loss is $W(\sum\sum \lambda^{i_1 i_2 \cdots i_k}_{ij} \alpha_i \alpha_j)$,

ii) the loss is $W_r^k$ $(r = 1, 2, \ldots, k - 1)$ if $[i_{j_1}, i_{j_2}, \ldots, i_{j_r}] \ne 0$ where
$j_1, j_2, \ldots, j_r$ are $r$ different positive integers less than or equal to $k$.
Also WU ^ TFt 2 g ^ Wl, Wtzl ^ WL 2 ,Wtzl g Wizl g 

etc. 

iii) the loss is Wq if (1, 2, • • • , p) = 0, where Wt ^ Wo, 

where the $W$’s are constants subject to some further slight restrictions which we
will impose later. The $\sum\sum$ has been used throughout to denote summation over
all values which $i$ and $j$ take on in $L^{i_1 i_2 \cdots i_k}$.

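We consider first the risk density function corresponding to $H_0$, that is

$$R_0(\theta, E) = W\left(\sum\sum \lambda_{ij} \alpha_i \alpha_j\right) \left(\frac{N}{2\pi}\right)^{p/2} |L|^{1/2} \exp\left[-\frac{N}{2} \sum\sum \lambda_{ij} (\bar x_i - \alpha_i)(\bar x_j - \alpha_j)\right].$$

To maximize $R_0(\theta, E)$, we have the set of $p$ equations obtained by setting the
$p$ partials of $R_0(\theta, E)$ with respect to the $\alpha_i$ equal to zero, which are necessary
conditions. We have

$$\frac{\partial R_0}{\partial \alpha_i} = \left\{\frac{\partial W}{\partial \alpha_i} + \left[N \sum_j \lambda_{ij} (\bar x_j - \alpha_j)\right] W\right\} \left(\frac{N}{2\pi}\right)^{p/2} |L|^{1/2} \exp\left[-\frac{N}{2} \sum\sum \lambda_{ij} (\bar x_i - \alpha_i)(\bar x_j - \alpha_j)\right]$$

so the necessary conditions are

$$\frac{\partial W}{\partial \alpha_i} + \left[N \sum_j \lambda_{ij} (\bar x_j - \alpha_j)\right] W = 0 \qquad (i = 1, 2, \ldots, p).$$

This can also be written

$$\left(2 \sum_j \lambda_{ij} \alpha_j\right) D_z W(z) + W(z)\, N \sum_j \lambda_{ij} (\bar x_j - \alpha_j) = 0$$

where we have set $z = \sum\sum \lambda_{ij} \alpha_i \alpha_j$ and where we use the notation $D_z$ to indicate
differentiation with respect to $z$. Fix $i$ at two particular values, say $a$ and $b$;
then two of the equations of this system can be written

$$\left(2 \sum_j \lambda_{aj} \alpha_j\right) D_z W(z) + W(z)\, N \sum_j \lambda_{aj} (\bar x_j - \alpha_j) = 0$$

$$\left(2 \sum_j \lambda_{bj} \alpha_j\right) D_z W(z) + W(z)\, N \sum_j \lambda_{bj} (\bar x_j - \alpha_j) = 0$$

that is

$$\left(\sum_j \lambda_{aj} \alpha_j\right)\left[\sum_j \lambda_{bj} (\bar x_j - \alpha_j)\right] = \left(\sum_j \lambda_{bj} \alpha_j\right)\left[\sum_j \lambda_{aj} (\bar x_j - \alpha_j)\right]$$

or

$$\left(\sum_j \lambda_{aj} \alpha_j\right)\left(\sum_j \lambda_{bj} \bar x_j\right) = \left(\sum_j \lambda_{bj} \alpha_j\right)\left(\sum_j \lambda_{aj} \bar x_j\right).$$

This we can write as

$$\sum\sum \lambda_{aj} \lambda_{bk}\, \alpha_j \bar x_k = \sum\sum \lambda_{bk} \lambda_{aj}\, \alpha_k \bar x_j$$

or

$$\sum\sum \lambda_{aj} \lambda_{bk} (\alpha_j \bar x_k - \alpha_k \bar x_j) = 0.$$

Giving $a$ and $b$ the combinations of values which are possible, this is a set of
$p^2$ linear homogeneous equations in the $p^2$ unknowns $(\alpha_j \bar x_k - \alpha_k \bar x_j)$ which has the
obvious solution $\alpha_j \bar x_k - \alpha_k \bar x_j = 0$ or $\alpha_j \bar x_k = \alpha_k \bar x_j$.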

Thus we have that the maximum of the function Ro{6, E) is reached for a set 
of values of the a,-'s which lie on the straight line 

(6) oti = (x,/xi)ai. 

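The function $R_0(E)$, which is the maximum of $R_0(\theta, E)$ with respect to the
$\alpha_i$'s, is a monotonically increasing function of $\sum\sum\lambda_{ij}x_ix_j$, which we show in the
following manner. Because of (5), we see that

$$\sum\sum\lambda_{ij}(x_i - \alpha_i)(x_j - \alpha_j) = \sum\sum\lambda_{ij}[x_i - (x_i/x_1)\alpha_1][x_j - (x_j/x_1)\alpha_1]$$

$$= \sum\sum\lambda_{ij}x_ix_j[1 - (\alpha_1/x_1)]^2.$$

Also,

$$\sum\sum\lambda_{ij}\alpha_i\alpha_j = \sum\sum\lambda_{ij}x_ix_j(\alpha_1/x_1)^2.$$

Hence, writing $w = \alpha_1/x_1$, we see that $R_0(E)$ is the maximum with respect to $w$ of

$$W\big(w^2\sum\sum\lambda_{ij}x_ix_j\big)\,C\,e^{-(N/2)(1 - w)^2\sum\sum\lambda_{ij}x_ix_j}$$

so for two sample points $E' = (x_1', x_2', \cdots, x_p')$ and $E'' = (x_1'', x_2'', \cdots, x_p'')$
such that $\sum\sum\lambda_{ij}x_i'x_j' = \sum\sum\lambda_{ij}x_i''x_j''$ it is clear that $R_0(E') = R_0(E'')$; thus $R_0(E)$
is a function of $\sum\sum\lambda_{ij}x_ix_j$.

But then without loss of generality, we can consider $R_0(E)$ along the $x_1$ axis,
i.e. for $x_2 = x_3 = \cdots = x_p = 0$. Using relation (5), we see that this implies
that the maximizing parameter values are $\alpha_2 = \alpha_3 = \cdots = \alpha_p = 0$. But then

$$R_0(E) = \text{max. of } W(\lambda_{11}\alpha_1^2)\,C\,e^{-(N/2)\lambda_{11}(x_1 - \alpha_1)^2}$$

which we have previously shown is a monotonic increasing function of $x_1^2$.
Therefore $R_0(E)$ is a monotonic increasing function of $\sum\sum\lambda_{ij}x_ix_j$.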

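We will furthermore show that the maximum of each risk density function
corresponding to parts i) as given in the weight functions are monotonically
increasing functions of certain quadratic forms in the $x_i$. Consider for example
the function corresponding to part i) of $R_1(\theta, E)$, that is

(6) $W(\sum\sum\lambda^1_{ij}\alpha_i\alpha_j)\,C\,e^{-(N/2)\sum\sum\lambda_{ij}(x_i - \alpha_i)(x_j - \alpha_j)}$.

We will write the maximum of this function with respect to the $\alpha_i$'s as $R_1(i)$.
Note that the weight function is not a function of $\alpha_1$, hence the partial derivative
of (6) with respect to $\alpha_1$ set equal to zero is equivalent to

$$\sum_j\lambda_{1j}(x_j - \alpha_j) = 0.$$

Squaring this relation and multiplying by $N/2\lambda_{11}$ gives

$$(N/2\lambda_{11})\sum\sum\lambda_{1i}\lambda_{1j}(x_i - \alpha_i)(x_j - \alpha_j) = 0$$

so we can write the exponent in (6)

$$\text{Exp.} = -(N/2\lambda_{11})\sum\sum(\lambda_{11}\lambda_{ij} - \lambda_{1i}\lambda_{1j})(x_i - \alpha_i)(x_j - \alpha_j).$$

Because of the definition of $\lambda_{ij}$, if we write $\omega_{ij}$ for the cofactor of $\sigma_{ij}$ in $|\sigma_{ij}|$,
we have

$$\text{Exp.} = -[N/2\lambda_{11}(|\sigma_{ij}|)^2]\sum\sum(\omega_{11}\omega_{ij} - \omega_{1i}\omega_{1j})(x_i - \alpha_i)(x_j - \alpha_j).$$

But by a well known algebraic identity*,

$$\omega_{11}\omega_{ij} - \omega_{1i}\omega_{1j} = |\sigma_{ij}| \cdot [\text{cofactor of } (\sigma_{11}\sigma_{ij} - \sigma_{1i}\sigma_{1j}) \text{ in } |\sigma_{ij}|]$$

$$= |\sigma_{ij}| \cdot \omega^1_{ij}$$

where we have written $\omega^1_{ij}$ to be the cofactor of $\sigma_{ij}$ in $|\sigma^1_{ij}|$, so

$$\text{Exp.} = -(N/2\lambda_{11}|\sigma_{ij}|)\sum\sum\omega^1_{ij}(x_i - \alpha_i)(x_j - \alpha_j).$$

But $\lambda_{11}|\sigma_{ij}| = \omega_{11} = |\sigma^1_{ij}|$, hence

$$\text{Exp.} = -\frac{N}{2}\sum\sum\lambda^1_{ij}(x_i - \alpha_i)(x_j - \alpha_j).$$

Therefore

$$R_1(i) = \max_{\text{all } \alpha_i\text{'s}} \; W(\sum\sum\lambda^1_{ij}\alpha_i\alpha_j)\,C\,e^{-(N/2)\sum\sum\lambda^1_{ij}(x_i - \alpha_i)(x_j - \alpha_j)}.$$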

* See M. Bôcher, Introduction to Higher Algebra. 

But then it follows in exactly the same way as with $R_0(E)$ that $R_1(i)$ is a
monotonically increasing function of $\sum\sum\lambda^1_{ij}x_ix_j$. For the other functions $R_k(i)$
corresponding to other hypotheses the argument is identical, and for risk
density functions corresponding to hypotheses with more than one $\alpha_i \ne 0$,
the same argument is repeated two or more times in succession to give the result.
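We will show that for any value of the parameters $\alpha_1, \alpha_2, \cdots, \alpha_p$ the relation

$$W(\sum\sum\lambda_{ij}\alpha_i\alpha_j) \ge W(\sum\sum\lambda^1_{ij}\alpha_i\alpha_j)$$

holds. This relation is true if the relation

(7) $\sum\sum[(\omega_{ij}/|\sigma_{ij}|) - (\omega^1_{ij}/|\sigma^1_{ij}|)]\alpha_i\alpha_j \ge 0$

is true, where we define $\omega^1_{1j} = 0$. That is, if

$$(1/|\sigma_{ij}||\sigma^1_{ij}|)\sum\sum(\omega_{ij}\omega_{11} - \omega^1_{ij}|\sigma_{ij}|)\alpha_i\alpha_j \ge 0$$

where we have substituted $\omega_{11}$ for its equal $|\sigma^1_{ij}|$. But note that
$\omega^1_{ij}$ = cofactor of $(\sigma_{11}\sigma_{ij} - \sigma_{1i}\sigma_{1j})$ in $|\sigma_{ij}|$,
hence by the identity quoted (see footnote 2)

$$|\sigma_{ij}|\,\omega^1_{ij} = \omega_{11}\omega_{ij} - \omega_{1i}\omega_{1j}$$

so the left hand member of relation (7) is

$$(1/|\sigma_{ij}||\sigma^1_{ij}|)\sum\sum(\omega_{ij}\omega_{11} - \omega_{11}\omega_{ij} + \omega_{1i}\omega_{1j})\alpha_i\alpha_j$$

$$= (1/|\sigma_{ij}||\sigma^1_{ij}|)\sum\sum\omega_{1i}\omega_{1j}\alpha_i\alpha_j \ge 0$$

since all matrices here are symmetric and positive definite. Note that the
argument can be repeated one or more times to show

$$W(\sum\sum\lambda_{ij}\alpha_i\alpha_j) \ge W(\sum\sum\lambda^{i_1i_2\cdots i_k}_{ij}\alpha_i\alpha_j)$$

or

$$W(\sum\sum\lambda^{j_1j_2\cdots j_r}_{ij}\alpha_i\alpha_j) \ge W(\sum\sum\lambda^{i_1i_2\cdots i_k}_{ij}\alpha_i\alpha_j)$$

where $i_1, i_2, \cdots, i_k$ are any set of $k$ different integers less than or equal to $p$,
and $j_1, j_2, \cdots, j_r$ are any subset of $i_1, i_2, \cdots, i_k$.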

Consider the maximum of the expressions 

We know that (p — r) of the ay’s in these expressions are zero and by an argu¬ 
ment similar to that given above®, it is clear that if the r ay’s not equal to zero 
are otyj, ay*, • • • , ay,, then the maximum of the expressions is given by 


»See p. 36. 
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Also for r = 0, the maximum is obviously 

Recall that we have restricted the TTJ's so that 

(8) 1^0 ^ TT? ^ ^ WS and ^ ^ Wt. 

From a previous calculation, it follows that 

(9) SSXiiXiX,- ^ ssx;jx<x ^ ^ . 

We can then quite easily calculate the region Afo, that is, the region of the 
sample space for which Ro{E) is the minimum of all the We 

have pointed out that 



Tr(SSX<,«.a,) ^ Tf(22Xj}*“ ” Va,) 

so it follows that 


(10) 

Ri>(.E) ^ iJi, <,...{*(») 

that is 

Ro{E) ^ 

so long as 

i^{E} is defined by 


From the relations (8) and (9), we have that 

(11) g 

for k = 2, 3, • • • , p. Now because 

is a monotonic decreasing function of $\sum\sum\lambda_{ij}x_ix_j$, and because $R_0(E)$ is a
monotonically increasing function of $\sum\sum\lambda_{ij}x_ix_j$, there is a value $r_0^2$ such that within
the ellipse $\sum\sum\lambda_{ij}x_ix_j = r_0^2$, the relation

( 12 ) Ro{E) < 

holds, and outside it the opposite inequality holds. But from relations (10)
and (12), it follows that within this ellipse, no $R_{i_1i_2\cdots i_k}(E)$ except $R_0(E)$ can be
defined by $R_{i_1i_2\cdots i_k}(i)$. Then in view of relation (11) and since a quantity is
certainly less than the maximum of several quantities if it is less than one of
those several quantities, the region $M_0$ is the set of points $\sum\sum\lambda_{ij}x_ix_j < r_0^2$.

Now consider the functions Ra{E) in the region outside Mo . We know that 
RaiE) = Raii) when 

max. of S 
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and we will write RiiE) = Ri(ii) when the opposite inequality holds. Consider 
a part of the sample space outside Mo in which 

Ri,{E) = Ri,(it) 

Ri,(E) = Ri,(ii) 


Ri,(E) - Ri,(ii) 

where A; ^ 1, and where RjiE) 9 ^ Rjiti) for j 5 ^ n , , • • • , ik . We see in this 

case that Ri^{E) = Ri^{E) = ••• = Ri^{E) < R}{E), where again j 9 ^ ii, 
1 ^ 2 » *' • , u . Furthermore, in this case, because of the relation (11), we have 
that E should be a point of either ilf • or, Mi ^. We will arbitrarily 

decide in this case that E should be a point of (« an integer ^ k) where 
i» is determined so that 

XlX'i^jXiXj S ^^yiijXiXj for any < = 1, 2, • • • , A:. 

Now consider the region in which $R_r(E) = R_r(i)$ for all $r = 1, 2, \cdots, p$.

We see that each $R_r(i)$ is the same monotonically increasing function of a
quadratic form of the type $\sum\sum\lambda^r_{ij}x_ix_j$. Hence in order that $E$ be a point of a
particular $M_r$, it is necessary that

(13) $\sum\sum\lambda^r_{ij}x_ix_j \le \sum\sum\lambda^s_{ij}x_ix_j$ for all $s \ne r$.

Now let us consider a fixed r and compare Rrii) with all Hri^(E )^s for A? ^ 1* 

We have pointed out that 

(14) ^X^iiXiXj ^ 

so Rrii) ^ Rriiiz-.-iitii) and hence Rrii) can be a minimum at most when all 
Rriiiz.-.iitiEys are defined by other than iZr*,i 2 ...t*(t). 

Consider then, any Rri^^) when defined by other than iJr.i(i), that is when 
RriiiE) is equal to one of 

^ (say) 

Because of the relations (8) and (14), we have that 

Rri,iE) 

whenever these are defined by other than and Furthermon> 

in the region defined by (13), we see that A<,(m) S hence (E) is 

never defined by Rriiiiii) in this region. 

Now the relation Rr(.i) < Rriidi) is easily seen to be equivalent to the relation 

(15) XSyiiX,^j < rj 
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for some value n . With the restriction on Wl that it be not so much lar^r 
than Wl that when (12) does not hold, RTi^(E) is not defined by Rriiiw)^ we have 
that the region for which Rr(i) < /Er<j,-,...<*(]&) is the region defined by (18) and 
(16). 

We then restrict the relationship between the constants Wl and TFo to be 
such that for all points outside of Mo but within the region defined by (13) and 
(16), the relation ^ IXX'ijXaj holds for ji , ,jk each 

different from r. Note that this is not an unreasonable restriction since the right 
hand side of the relation is bounded above by rf, XXXijXiXjis bounded below by 
rl , and therefore, is bounded below by some positive value 

where r* is a monotonically increasing function of rl . 

Using a similar method, the r^on can be obtained after all regions 

all m < have been derived. If some further restrictions are 
imposed on the constants in the weight functions similar to those formulated 
in deriving the region Mr, it can be shown that the region Mi^i^...i^{k ^ 1) 
will be given by the inequalities 

22\%jXiXj ^ 9*0 

^ rii for all m < k and all ji, • • • ,jm 

22X:j‘**’*x»xy S SSXif*“''*XiXy for all 

and 

2SXj)‘^ "S.xy < rl. 

Thus we have rationalized the following solution of the question posed at the
beginning of section 4. We test the hypothesis $E(x_1) = E(x_2) = \cdots = E(x_p) = 0$
using the generalized Student ratio, replacing the sample covariance
matrix by the population covariance matrix since the latter is assumed to be
known, at some chosen level of significance. If the hypothesis is not rejected,
we make the decision corresponding to $H_0$. If the ratio is significant, we
compute the ratios $T^{i_1i_2\cdots i_k}$, where by definition $T^{i_1i_2\cdots i_k}$ is the generalized
Student ratio computed for $x_{j_1}, x_{j_2}, \cdots, x_{j_{p-k}}$ ($i_1, i_2, \cdots, i_k, j_1, j_2, \cdots, j_{p-k}$
is a permutation of the integers $1, 2, \cdots, p$), the variates $x_{i_1}, x_{i_2}, \cdots, x_{i_k}$
being ignored.

We consider the smallest of the ratios computed on the basis of $(p - 1)$ of
the $x_i$'s; say it is $T^r$. Then if $T^r$ is not significant at some level of significance
(which need not be the same level as considered before), we make the decision
corresponding to $H_r$; if $T^r$ is significant, we compute all the ratios based on
$(p - 2)$ of the $x$'s. If $T^{rs}$ is the smallest of these, we make the decision
corresponding to $H_{rs}$ if $T^{rs}$ is not significant but proceed to calculate the ratios based
on $(p - 3)$ of the $x_i$'s if it is significant, and so on.
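The step-down procedure just described can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the ratio on a subset of variates is taken to be $N\bar{x}'\Sigma^{-1}\bar{x}$ restricted to that subset (with the known covariance matrix), the critical values are supplied by the user in a hypothetical table `critical` indexed by the number of variates kept, and indices are 0-based. The names `solve`, `ratio` and `step_down` are not from the paper.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ratio(xbar, sigma, n_obs, keep):
    """Assumed form of the ratio: N * xbar' Sigma^{-1} xbar on the variates in `keep`."""
    sub_x = [xbar[i] for i in keep]
    sub_s = [[sigma[i][j] for j in keep] for i in keep]
    y = solve(sub_s, sub_x)
    return n_obs * sum(u * v for u, v in zip(sub_x, y))


def step_down(xbar, sigma, n_obs, critical):
    """Return the indices of the means declared non-zero.

    critical[k] is a user-supplied critical value for a ratio on k variates
    (an assumption of this sketch; the levels need not be equal).
    """
    p = len(xbar)
    for n_ignored in range(p):
        size = p - n_ignored
        # smallest ratio among all subsets of `size` variates
        best = min(combinations(range(p), size),
                   key=lambda keep: ratio(xbar, sigma, n_obs, keep))
        if ratio(xbar, sigma, n_obs, best) <= critical[size]:
            return tuple(i for i in range(p) if i not in best)
    return tuple(range(p))  # every ratio was significant
```

An empty tuple corresponds to the decision $H_0$; a tuple such as `(1,)` corresponds to the decision that only the mean with that index differs from zero.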

5. Concluding remarks. It should be pointed out that while the derivation
of the explicit inequalities defining the various regions of acceptance may be
rather involved, for any given sample point $E$, it is relatively simple to determine
the region of acceptance to which this point $E$ belongs. That is, we calculate
the various values $R_{i_1i_2\cdots i_k}(E)$ and choose the decision $H_{j_1j_2\cdots j_r}$ if $R_{j_1j_2\cdots j_r}(E)$
is the minimum of the values of $R_{i_1i_2\cdots i_k}(E)$ for all values of $i_1, i_2, \cdots, i_k$.
For making a decision on the basis of a given sample point $E$, it is not necessary
to find explicit analytic formulas defining the shapes of the various regions of
acceptance.

Since the principle used here is proposed merely as a substitute for Wald's
principle for the sake of mathematical simplification, it is felt that in certain
problems Wald's principle may be used as a check on the results. For example,
it is felt that the new principle is apt to lead to decision regions of the proper
shape though the exact sizes of these regions may not be correct. In cases where
the decision regions cannot be determined by Wald's principle, it seems possible
that a determination may be made in Wald's sense among the various decision
regions having the same shapes as those given by the new principle. In the
case considered here, for example, it may be possible to determine new values of
$r_0^2, r_1^2, \cdots$.

I should like to express my very great appreciation to Professor H. Hotelling 
for many suggestions during the preparation of this paper and to Professor A. 
Wald for constant guidance. I should also like to credit Professor Helen Walker 
with originally posing the question that led to this research. 

A TWO-SAMPLE TEST FOR A LINEAR HYPOTHESIS WHOSE POWER 
IS INDEPENDENT OF THE VARIANCE 

By Charles Stein 
Asheville, N. C. 

1. Introduction. In a paper in the Annals of Mathematical Statistics, Dantzig
[1] proves that, for a sample of fixed size, there does not exist a test for
Student's hypothesis whose power is independent of the variance. Here, a
two-sample test with this property will be presented, the size of the second sample
depending upon the result of the first. The problem of determining confidence
intervals, of preassigned length and confidence coefficient, for the mean of a
normal distribution with unknown variance is solved by the same procedure.
These considerations, including the non-existence of a single-sample test whose
power is independent of the variance, are extended to the case of a linear
hypothesis. In order to make the power of a test or the length of a confidence
interval exactly independent of the variance, it appears necessary to waste a
small part of the information. Thus, in practical applications, one will not use
a test with this property, but rather a test which is uniformly more powerful, or
an interval of the same length, whose confidence coefficient is a function of $\sigma$,
but always greater than the desired value, the difference usually being slight, at
the same time reducing the expected number of observations by a small amount.

Any two sample procedure, such as that discussed in this paper, can be con¬ 
sidered a special case of sequential analysis developed by Wald [5]. 

The problem of whether these tests and confidence intervals are in any sense
optimum is unsolved. It is difficult even to formulate a definition of an optimum
among sequential tests of a hypothesis against multiple alternatives. However,
it is shown that, if the variance and initial sample size are sufficiently large, the
expected number of observations differs only slightly from the number of
observations required for a single-sample test when the variance is known. It also
seems likely that the confidence intervals do possess some optimum property
among the class of all two-sample procedures.

Although Student's hypothesis is a special case of a linear hypothesis, it is
treated separately, because it illustrates the basic idea without any complicated
notation or new distributions. The test for Student's hypothesis involves the
use only of Student's distribution, even for the power of the test, while the power
function of the test proposed here for a linear hypothesis involves a new type of
non-central $F$-distribution.

The notation $\chi^2_n$ is used as a generic symbol for a random variable equal to
the sum of squares of $n$ independently normally distributed random variables
with mean 0 and variance 1, i.e., $\chi^2_n$ has the $\chi^2$ distribution with $n$ degrees of
freedom.


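$$P\{\chi^2_n < r\} = \begin{cases} \dfrac{1}{(\sqrt{2})^n\,\Gamma(\tfrac{1}{2}n)} \displaystyle\int_0^r u^{\frac{1}{2}n - 1} e^{-\frac{1}{2}u}\,du & \text{for } r > 0, \\[2ex] 0 & \text{for } r \le 0. \end{cases}$$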

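The notation $t_n$ is used as a generic symbol for $x\sqrt{n}/\chi_n$, where $x$ is normally
distributed with mean 0 and variance 1, independently of $\chi_n$, i.e., $t_n$ has the
distribution of Student's $t$ with $n$ degrees of freedom,

$$P\{t_n < \tau\} = \frac{\Gamma(\tfrac{1}{2}(n + 1))}{\sqrt{n\pi}\,\Gamma(\tfrac{1}{2}n)} \int_{-\infty}^{\tau} \left(1 + \frac{t^2}{n}\right)^{-\frac{1}{2}(n+1)} dt.$$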


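$F_{m,n}$ is a generic symbol for a random variable of the form $F_{m,n} = n\chi^2_m / m\chi^2_n$,
the numerator and denominator being independently distributed, i.e., $F_{m,n}$ has
the distribution of an $F$-ratio with $m$ and $n$ degrees of freedom,

$$P\{F_{m,n} < r\} = \frac{\Gamma(\tfrac{1}{2}(m + n))}{\Gamma(\tfrac{1}{2}m)\,\Gamma(\tfrac{1}{2}n)} \left(\frac{m}{n}\right)^{\frac{1}{2}m} \int_0^r F^{\frac{1}{2}m - 1} \left(1 + \frac{m}{n}F\right)^{-\frac{1}{2}(m+n)} dF.$$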


A symbol of the above type with an additional subscript $\alpha$ denotes the upper
$100\alpha\%$ significance level, e.g., $t_{n,\alpha}$ is defined by

$$P\{t_n > t_{n,\alpha}\} = \alpha.$$

The symbol $E\{x \mid Q(x)\}$ denotes the set of all $x$ such that the condition $Q(x)$
holds. This should not be confused with $E(x \mid T)$, which denotes the expected
value of a random variable $x$, given the conditions $T$.

The size of a critical region is the probability that the sample point will lie 
within the region under the null hypothesis. The terms length and volume, as 
applied to confidence regions are used in the ordinary geometrical sense. 


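2. The test for Student's hypothesis. Suppose $x_i$, $i = 1, 2, \cdots$ are
independently normally distributed with mean $\xi$ and variance $\sigma^2$. We wish to test
the hypothesis $\xi = \xi_0$, the power of the test to depend only upon $\xi - \xi_0$, not
upon $\sigma^2$. For this purpose we define a statistic $t'$ as follows. A sample of $n_0$
observations, $x_1, \cdots, x_{n_0}$, is taken, and the sample estimate, $s^2$, of the variance
computed by

(1) $s^2 = \dfrac{1}{n_0 - 1}\left\{\sum_{i=1}^{n_0} x_i^2 - \dfrac{1}{n_0}\left(\sum_{i=1}^{n_0} x_i\right)^2\right\}$.

Then $n$ is defined by

(2) $n = \max\left\{\left[\dfrac{s^2}{z}\right] + 1,\; n_0 + 1\right\}$

where $z$ is a previously specified positive constant, $[q]$ denoting the largest
integer less than $q$. Additional observations, $x_{n_0+1}, \cdots, x_n$, are taken, and, in
accordance with an initially specified rule depending only upon $s^2$, real numbers
$a_i$, $i = 1 \cdots n$ are chosen in such a way that

(3) $\sum_{i=1}^{n} a_i = 1, \qquad a_1 = a_2 = \cdots = a_{n_0}, \qquad s^2\sum_{i=1}^{n} a_i^2 = z.$

This is clearly possible since

(4) $\min \sum_{i=1}^{n} a_i^2 = \dfrac{1}{n} \le \dfrac{z}{s^2}$ by (2),

the minimum being taken subject to the conditions

$$\sum_{i=1}^{n} a_i = 1, \qquad a_1 = a_2 = \cdots = a_{n_0}.$$

Then $t'$ is defined by

(5) $t' = \dfrac{\sum_{i=1}^{n} a_ix_i - \xi_0}{\sqrt{z}} = \dfrac{\sum_{i=1}^{n} a_i(x_i - \xi)}{\sqrt{z}} + \dfrac{\xi - \xi_0}{\sqrt{z}} = u + \dfrac{\xi - \xi_0}{\sqrt{z}},$

where

(6) $u = \dfrac{\sum_{i=1}^{n} a_i(x_i - \xi)}{\sqrt{z}}.$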


Then $u$ has the distribution of Student's $t$ with $n_0 - 1$ degrees of freedom,
regardless of the value of $\sigma^2$. For $(n_0 - 1)s^2/\sigma^2$ has the distribution of $\chi^2_{n_0-1}$;
the conditional distribution of $\sum a_i(x_i - \xi)$, given $s$, is normal with
mean 0 and variance $\sigma^2\sum a_i^2 = \sigma^2 z/s^2$. But the usual form of a random variable
$t_{n_0-1}$ is $t_{n_0-1} = y/s$, $y$ being normally distributed with mean 0 and variance $\sigma^2$,
and $(n_0 - 1)s^2/\sigma^2$ having the distribution of $\chi^2_{n_0-1}$, independent of $y$. Thus the
conditional distribution of $u$, given $s$, is normal with mean 0 and variance $\sigma^2/s^2$,
so that $t_{n_0-1}$ and $u$ have the same distribution.

This theorem can be used to obtain an unbiased test for the hypothesis $H_0$
that $\xi = \xi_0$, the power being independent of $\sigma^2$, which is supposed unknown.
Let $\alpha$ be the desired size of the critical region and let $t_{n_0-1,\alpha/2}$ be such that

(7) $P\{|t_{n_0-1}| > t_{n_0-1,\alpha/2}\} = \alpha.$

Then if we reject $H_0$ whenever

(8) $\dfrac{|\sum a_ix_i - \xi_0|}{\sqrt{z}} > t_{n_0-1,\alpha/2},$

we obtain an unbiased test of $H_0$, whose power function is $1 - \beta(\xi)$ where

(9) $\beta(\xi) = P\left\{-t_{n_0-1,\alpha/2} + \dfrac{\xi_0 - \xi}{\sqrt{z}} < t_{n_0-1} < t_{n_0-1,\alpha/2} + \dfrac{\xi_0 - \xi}{\sqrt{z}}\right\}.$

The fact that the test is unbiased follows immediately from the symmetry and
unimodality of the $t$ distribution.
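The two-stage construction of this section can be sketched as follows. This is a minimal illustration, not the author's prescription: the previously specified positive constant is called `z` in the code, $[q]$ is read as the largest integer less than $q$, and the particular rule for the weights (one common value on the first $n_0$ observations, another common value on the rest) is just one choice satisfying conditions (3). All function names are hypothetical.

```python
import math


def sample_variance(xs):
    """Usual unbiased variance estimate s^2 from the first sample."""
    n0 = len(xs)
    mean = sum(xs) / n0
    return sum((x - mean) ** 2 for x in xs) / (n0 - 1)


def total_sample_size(s2, z, n0):
    """n = max([s^2/z] + 1, n0 + 1), with [q] the largest integer less than q."""
    largest_below = math.ceil(s2 / z) - 1
    return max(largest_below + 1, n0 + 1)


def weights(s2, z, n0, n):
    """One possible rule for the a_i (an assumption of this sketch):
    a common value a on the first n0 observations and a common value b
    on the remaining n - n0, solving sum(a_i) = 1 and s2 * sum(a_i^2) = z."""
    m = n - n0
    c = z / s2  # required value of sum(a_i^2)
    # max(..., 0.0) guards against tiny negative rounding when n*c == 1
    root = math.sqrt(max(m * (n * c - 1.0) / n0, 0.0))
    a = (1.0 - root) / n
    b = (1.0 - n0 * a) / m
    return [a] * n0 + [b] * m


def t_prime(xs, a, xi0, z):
    """t' = (sum a_i x_i - xi0) / sqrt(z), to be referred to Student's t
    with n0 - 1 degrees of freedom."""
    return (sum(ai * xi for ai, xi in zip(a, xs)) - xi0) / math.sqrt(z)
```

In use, one would compute `s2` from the first `n0` observations, call `total_sample_size` to find how many observations are needed in all, draw the remaining ones, and only then evaluate `t_prime` on the full sample.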

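If we wish to test the hypothesis $H_0: \xi = \xi_0$ against one-sided alternatives
$\xi > \xi_0$, the procedure is similar. The critical region of size $\alpha$ is defined by

(10) $\dfrac{\sum a_ix_i - \xi_0}{\sqrt{z}} > t_{n_0-1,\alpha}$

and the power function is

(11) $1 - \beta(\xi) = P\left\{t_{n_0-1} > t_{n_0-1,\alpha} - \dfrac{\xi - \xi_0}{\sqrt{z}}\right\}.$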

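A confidence interval for $\xi$, of predetermined length $l$ and confidence
coefficient $1 - \alpha$, can be obtained by selecting $z$ so that

(12) $1 - \alpha = P\left\{-\dfrac{l}{2\sqrt{z}} < t_{n_0-1} < \dfrac{l}{2\sqrt{z}}\right\}$

$\quad = P\left\{-\dfrac{l}{2\sqrt{z}} < \dfrac{\sum a_i(x_i - \xi)}{\sqrt{z}} < \dfrac{l}{2\sqrt{z}}\right\}$

$\quad = P\left\{\left|\sum a_ix_i - \xi\right| < \dfrac{l}{2}\right\}$

$\quad = P\left\{\sum a_ix_i - \dfrac{l}{2} < \xi < \sum a_ix_i + \dfrac{l}{2}\right\}$

where $\xi$ is the true mean of the distribution. Thus $(\sum a_ix_i - l/2,\; \sum a_ix_i + l/2)$
is the desired confidence interval.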

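In the above tests and confidence intervals, the distribution of the required
number of observations, $n$, is

(13) $P\{n = n_0 + 1\} = P\{(n_0 - 1)s^2/\sigma^2 < (n_0 + 1)(n_0 - 1)z/\sigma^2\} = P\{\chi^2_{n_0-1} < y\}$

$\quad = \dfrac{1}{(\sqrt{2})^{n_0-1}\,\Gamma(\tfrac{1}{2}(n_0 - 1))} \displaystyle\int_0^y e^{-\frac{1}{2}u}\,u^{\frac{1}{2}(n_0-3)}\,du,$

where $y = (n_0 + 1)(n_0 - 1)z/\sigma^2$, and

(14) $P\{n = \nu\} = P\left\{\nu < \dfrac{s^2}{z} + 1 < \nu + 1\right\} = P\{(\nu - 1)(n_0 - 1)z/\sigma^2 < \chi^2_{n_0-1} < \nu(n_0 - 1)z/\sigma^2\}$

$\quad = \dfrac{1}{(\sqrt{2})^{n_0-1}\,\Gamma(\tfrac{1}{2}(n_0 - 1))} \displaystyle\int_{(\nu-1)(n_0-1)z/\sigma^2}^{\nu(n_0-1)z/\sigma^2} e^{-\frac{1}{2}u}\,u^{\frac{1}{2}(n_0-3)}\,du,$

for integral $\nu > n_0 + 1$, all other values being impossible. Thus the expected
number of observations, $E(n)$, satisfies the inequalities

(15) $\dfrac{1}{(\sqrt{2})^{n_0-1}\,\Gamma(\tfrac{1}{2}(n_0 - 1))} \left\{\displaystyle\int_0^y (n_0 + 1)\,e^{-\frac{1}{2}u}u^{\frac{1}{2}(n_0-3)}\,du + \int_y^\infty \frac{\sigma^2u}{(n_0 - 1)z}\,e^{-\frac{1}{2}u}u^{\frac{1}{2}(n_0-3)}\,du\right\} < E(n)$

$\quad < \dfrac{1}{(\sqrt{2})^{n_0-1}\,\Gamma(\tfrac{1}{2}(n_0 - 1))} \left\{\displaystyle\int_0^y (n_0 + 1)\,e^{-\frac{1}{2}u}u^{\frac{1}{2}(n_0-3)}\,du + \int_y^\infty \left(\frac{\sigma^2u}{(n_0 - 1)z} + 1\right)e^{-\frac{1}{2}u}u^{\frac{1}{2}(n_0-3)}\,du\right\}$

which can be rewritten

(16) $(n_0 + 1)P\{\chi^2_{n_0-1} < y\} + \dfrac{\sigma^2}{z}P\{\chi^2_{n_0+1} > y\} < E(n)$

$\quad < (n_0 + 1)P\{\chi^2_{n_0-1} < y\} + \dfrac{\sigma^2}{z}P\{\chi^2_{n_0+1} > y\} + P\{\chi^2_{n_0-1} > y\}.$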

Consequently P(n) is a function of and can be evaluated from tables of the 
incomplete T function. 

As mentioned in the introduction, these tests and confidence intervals will
not be used exactly in this form, since they waste information in order to make
the power of the test or the length of the confidence interval strictly independent
of the variance. Instead of (2) we take a total of

(17) $n = \max\left\{\left[\dfrac{s^2}{z}\right] + 1,\; n_0\right\}$

observations, and define

(18) $t'' = \dfrac{\left(\frac{1}{n}\sum_{i=1}^{n} x_i - \xi_0\right)\sqrt{n}}{s} = \dfrac{\left(\frac{1}{n}\sum_{i=1}^{n} x_i - \xi\right)\sqrt{n}}{s} + \dfrac{(\xi - \xi_0)\sqrt{n}}{s} = u' + \dfrac{(\xi - \xi_0)\sqrt{n}}{s}.$

By the same reasoning as that following (6), $u'$ has the $t$ distribution with $n_0 - 1$
degrees of freedom. By (2)

(19) $n > s^2/z$, so that, although $(\xi - \xi_0)\sqrt{n}/s$ is a random variable,

(20) $\dfrac{|\xi - \xi_0|\sqrt{n}}{s} > \dfrac{|\xi - \xi_0|}{\sqrt{z}}.$

Thus, if we use

(21) $|t''| > t_{n_0-1,\alpha/2}$ or $t'' > t_{n_0-1,\alpha}$

instead of (8) or (10) respectively, we shall always increase the power of the test.
Also the expected number of observations will be reduced from that in (16) by
$P\{\chi^2_{n_0-1} < y\}$. Similarly if $z$ is defined as in (12), the interval

$$\left(\frac{1}{n}\sum_{i=1}^{n} x_i - \frac{l}{2},\;\; \frac{1}{n}\sum_{i=1}^{n} x_i + \frac{l}{2}\right)$$

has length $l$, and the probability that it covers the true mean $\xi$ is a function of $\sigma$,
but is always greater than $1 - \alpha$, and differs only slightly from $1 - \alpha$ if $\sigma^2 > zn_0$.
Thus it can be used instead of the confidence interval (12).
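The modified procedure might be sketched as below. The sketch assumes that the modified sample size is the larger of $[s^2/z] + 1$ and $n_0$, and that the statistic is the ordinary sample mean of all observations, centred at $\xi_0$ and scaled by $\sqrt{n}/s$, where $s$ is still the standard deviation estimated from the first sample only. The function names are hypothetical, and `z` denotes the previously specified constant.

```python
import math


def modified_sample_size(s2, z, n0):
    """n = max([s^2/z] + 1, n0), with [q] the largest integer less than q
    (assumed form of the modified rule)."""
    largest_below = math.ceil(s2 / z) - 1
    return max(largest_below + 1, n0)


def t_double_prime(xs, s, xi0):
    """(mean of all n observations - xi0) * sqrt(n) / s, where s comes from
    the first sample only."""
    n = len(xs)
    return (sum(xs) / n - xi0) * math.sqrt(n) / s
```

Compared with the unmodified rule, no extra observation is forced when the first sample already suffices, and no unequal weights are needed.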

From (16) it follows that

$$\lim_{\sigma \to \infty} \left|E(n) - \frac{\sigma^2}{z}\right| \le 1,$$

the approximation $E(n) = \sigma^2/z$ being fair provided $\sigma^2 > zn_0$. The length of
the confidence interval (12) is given by

$$l = 2\sqrt{z}\;t_{n_0-1,\alpha/2} \approx \frac{2\sigma\,t_{n_0-1,\alpha/2}}{\sqrt{E(n)}}.$$

When the variance $\sigma^2$ is known, the length of the single-sample confidence
interval of confidence coefficient $1 - \alpha$ obtained on the basis of $n$ observations
is given by

$$1 - \alpha = \frac{1}{\sqrt{2\pi}} \int_{-l\sqrt{n}/2\sigma}^{l\sqrt{n}/2\sigma} e^{-\frac{1}{2}x^2}\,dx,$$

i.e.,

$$l = 2t_{\infty,\alpha/2}\,\sigma/\sqrt{n}.$$

Since, even for moderate values of $n_0$, say $n_0 > 30$, $t_{n_0-1,\alpha/2}$ differs only slightly
from $t_{\infty,\alpha/2}$, the expected number of observations for a confidence interval of
given length and confidence coefficient is only slightly larger than the fixed
number of observations required in the single-sample case when the variance is
known, provided the variance is moderately large.
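As a rough worked comparison, and assuming the approximation $E(n) \approx \sigma^2/z$ together with the relation $l = 2\sqrt{z}\,t_{n_0-1,\alpha/2}$ for the two-sample interval (the symbol $n_{\text{known}}$ is introduced here only for illustration), the two sample sizes can be set side by side:

```latex
\begin{align*}
z &= \frac{l^2}{4\,t_{n_0-1,\alpha/2}^2}, &
E(n) &\approx \frac{\sigma^2}{z} = \frac{4\,t_{n_0-1,\alpha/2}^2\,\sigma^2}{l^2}, \\
n_{\text{known}} &= \frac{4\,t_{\infty,\alpha/2}^2\,\sigma^2}{l^2}, &
\frac{E(n)}{n_{\text{known}}} &\approx \left(\frac{t_{n_0-1,\alpha/2}}{t_{\infty,\alpha/2}}\right)^2 .
\end{align*}
```

For instance, with $\alpha = 0.05$ and $n_0 = 31$ the tabled values are $t_{30,0.025} = 2.042$ and $t_{\infty,0.025} = 1.960$, so the ratio is about $1.085$: under these approximations the two-sample procedure needs roughly 8.5 per cent more observations on average than the known-variance case.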

3. Distribution of a non-central F-ratio. In the extension of the above
considerations to the testing of a general linear hypothesis, the power function
depends on the distribution of a quantity


( 22 ) 


^' = S (9. - 

I 


where qi = , Xi i>eing independently normally distributed with mean 0 and 

v ‘f' 

variance 1, and r having the Xn distribution, independently of the r*. The 
Ci are real constants. 


Ijet 


(23) 

(24) 


1 1 

= I: (X, - c, Vrf - (f Vr |/|: e\j 


Now, (x, — is a quadratic form of rank m — 1 since the r, — Cif are 
1 

subject to one linear homogeneous restriction, namely ^ ~ 

i 

m 

Also f is of rank 1, and so that, by Cochran’s Theoi’em, % aod 

1 

f are independently distributed as xi-i xi respectively. Thus there exist 
!Ji ‘ Vm y independently normally distributed with mean 0 and variance 1 
such that 

(25) X = + yl 

• €-2 2 

f = yi. 

Vi 

I^t Ui = , Then the joint distribution of Ui • • • Um is given by 


^ Tl , * • * , 'Uro ^ "ffii 


1 _ 1 


X f e •'‘r’'*"” dr f ^ f dpt • • • dy„ . 

Jo A-00 J—op 


( 26 ) 
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The density function is given by 

< Ti, - ' , < rm} 


_ - __ f Kn- 2 ) Am j 

- (V^r(V2)"r(in)X ® " r e , dr 


1 


(27) 


Then let 
(28) 


■ (A/2i)” (A/2)"r(in) Jo 

( n \--J(m+n) 

_!+?'■) Th,,: 

(\4)"‘2*‘"'+"> r(in) Jo ® ‘ 

r(Kn + m)) A , A jV 
(>/;)"r(in) V + 4- ’’7 


fe-K ‘40 

•70 


,i{n+w— 2 ) 


dr 


-H‘>J(n+m- 2 ) 
—i(m+n) 


/ jl ^ 

“ V?" 


..2 -1. 

T = — = «2 + 


+ l4.. 


The joint distribution of i/' and t'^ is thus, by (27), 

PW < V, r'^ < r^] 

_ r(§(m + n)) r f /• A , V sV*'”"”’ , , 

“ (v'^)”r(4n) J J ■' ■ J V ? “7 rfwi • • • 


(29) 


Su*<r* 

i 

r(i(»» + n)) f f f I 2'!- 

(v^rnin) JJ - J 


i(wf n)+?^(nT-I) 


Sv*<T*/(l+Ui) 

3 


r(Kw» + »)) 

(\/^)"‘r(in) 




■i(m+») 


dui dy 2 dy„, 


//•••/ 


-i(n+l) 


Sv?<»‘*/(i+wf) 


(‘+?4 


i(»»+«) 


dUi dy2 dpm. 


In order to evaluate this integral, we use the fact that the distribution of a ratio
of $\chi^2_{m-1}$ to $\chi^2_{n+1}$, the two being independent, can be expressed in two forms, by
(27) and Wilks [2], p. 114,

PlXm-i/Xii+1 <'1'] — rar«. _ j. j[, (t + p) dip 


(30) 


r(i(m - l))r(§(n + D) 
r(|(w + n)) 


(VT)’*-‘r(i(n + D) 


/ M / m—1 \-!(»»+n) 

■ ■ ■ J V ^ ? ®7 dqi-- - dq^. 


S t*<* 





so that 

PW < V, r'’ < r“l 

+ «)) 


(31) 


\/;r(jn)r(j(»t -1)) 

p r*/(l+ttj) 

X / / 

r(i(OT + n)) 


(1 + m !)-^'’*+‘ V *‘"^’(1 + 


\/irr(in)r(j(m -1)) 

X r f (1 + «*)“*<"+" f(1 + rz^'^ (1 + df du 

Ju-.<-oo •'f-0 \ 1 -T U / 

- /.',_ C + ”’ + 

Now we wish to find the distribution of 

= E (u - c,)‘ 


(32) 


E (*. - c.Vry 


X ^ (r - x^vuciy 

r r 


= r'= + - V2dr. 

Carrymg out the transformation (32), it is found that the joint density function 
of ij' and F' is 

pW, F') dV dF' 


r(i(w + n)) 




(33) 


VTr(in.)r(i(m“- f)) 

X [1 + u" + F' - (,' - dT,' dF' 

_ r(K>» + w)) ^ 

v^r(§n)r(Km-1))^'^ 

X [1 + F' + 2p\/^- + dp dF', 

where p = *>' — -s/^J. In order to obtain the distribution of F' we must inte¬ 
grate out p over —•\/F < p < \/F, obtaining 

p{F' < rt = <^».«(2; 12c?) 

r(|(w + n)) 


(34) v^r(in)r(J(»n - 1)) 
tVF‘ 


X I f " [F' - p’]‘'”^’[l + F' + 2p V^i + Sc?]-*'"-*-"’ dp dF'. 

Jf' mmO Jp>=m—y/T' 

In the case $\sum c_i^2 = 0$, (34) reduces to the distribution of the ratio $\chi^2_m/\chi^2_n$.

4. Test of a linear hypothesis. In this case the power of the test usually
employed is affected not only by the variance, but also by the values of the
predictors. In order to avoid this difficulty, it will be assumed that only a
predetermined number of different sets of predictors are used, and that these sets are
repeated as a whole, as many times as is necessary. This covers, in particular,
the replication of orthogonal designs for the analysis of variance.

Let Viji i = 1 • • m, j = 1, 2, • • • be independently normally distributed 
with means 


(36) 


, M < ^, rank {Xki) = 

Jb-l 


and variance or*, the Xki being given in advance, <r* and a* unknowTi. We wish 

to test ^ Cinak — Cm , I — I • • • r < ju, where we may suppose equations 

(36) linearly independent, the cik being given constants. It will be convenient 
to reduce this to a canonical form, as in Tang [3]. First, by a non-singulai 
linear transformation 


(37) 

we can make 

m 

(38) 



Xki = hklBli 


e^i) = M X M identity matrix, 


any two sets of bki that accomplish, this being related by an orthogonal trans¬ 
formation. Then (35) becomes 


(39) 

and (34) l)ecomes 

(40) 


jfc-i y-=i 


M / M \ M ^ 

^ ^ ( 2!) I ~ ^ ^ki 7 

f-1 / k^l 


cio = 23 u* = 2^ Cj* 23 a« h^’' 

• ' ib—1 m-l 


fc-l 


m»l Af—l 


M 

I 


23 cLum, Z = 1 ••• r < M, 


where 6”^ are such that Xb”*^bki = Smh l^be Kronecker delta, or, in matrix notation 
(^Am)*^ = (6*”). Next, the equations (40) can be made into an orthonormal set 


ff 

C|0 


^ ClmoL 


( 41 ) 
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i.e., one in which 


ff ff - 

JL 0 ■“ »*/ 


by a non-singular linear transformation on the cim . Clearly Scio’ is an invariant 
of (41), i^e., it does not depend upon the choice of a particular transformation 
(37), or of a particular transformation of the cjm into , since, in both cas^, 
all admissible transformations are connected by an orthogonal transformatimi. 
Then we define 

m 

(43) y'is “ 53 i = 1, • • •,M 

tf-l 

(44) y'ii = t = *1 + 1, »» 

«-i 


fe) ‘ 


in such a way that I I is an orthogonal matrix which is possible, by (38). 


m m M 

Wa = 53 = 53 »«* 53 «*« <** 


9-.1 4-1 /b-l 

M m 

]C 53 = «< for » = 1, • • • , Mi 

fc—'I 4—1 

= 2 = S <f<« 53 **» «* 

4—1 4—1 ifc—1 

M m 

= 53 o* 53 diftkt = 0 .for t = M + li • • •, m. 

fc—1 4—1 


Finally we define 
(47) 


If I 

Vi) VHy 


?' = M + 1 • • ' , 


2 /i/ = S c<my«y, i « 1, • • • , r 


//*? — ^ ^ ^imVinj J i r “h 1, • • • , 


where the Cin are such that 




an orthogonal matrix. Since the transforma¬ 


tion applied to the to obtain j/,-, is orthogonal, the are independently 
normally distributed with variance <t“. Also 

(50) I<:y"i = 0, t = M + 1, • • • , ^ 

(51) Ey", = 2 i = 1, • • • , r 


(52) 


%<;• = £, CimCh 


t =* r + 1, • • * , 
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Since (50), (51), (52) were obtained from the original formulation by a non¬ 
singular linear transformation, the derivation can be reversed, which implies 
the equivalence of (50), (51), (52) to the problem as originally formulated. * 
Thus we can restate the problem in the following manner. Let t = 1, 
••*,<, i *= 1, 2, * • • be independently normally distributed with variance «r* 
and means 

(») 

Eyij = 0, i = /* + 1, • • • , and <r imknown. 

We wish^to test 

(54) Holii = 0,i = 1, ••• ,p < M 

the {,• for ^ = p + 1 * * • M and being nuisance parameters. 

Obtain a first sample yis,i — 1, = 1, • • • , n© . Estimate the vari¬ 

ance by 


M (j-i Wo t-i \j-i / ) 


Ijet s be a predetermined constant, and n be defined by 


n = max 


+. 1, Wo + 11 


After 8^ has been obtained, determine a set of real numbers, ai ••• On , in accord¬ 
ance with a preassigned rule, so as to satisfy 

2ay « 1 

(67) s“2af = 0 

= ’ ’ * ~ driQ . 


t (± 

has the non-central F-distribution given by (34) with n — — pL,m ^ 'p and 

(59) “ Z) fV(wo^ — m)», 

1 1 

where are the true means, allowing for the possibility that Hq is not true. For, 
(tM — has the distribution of Xno<->^> has been determined, 

n 

Z 1 • • * ?*, independently normally distributed with mean 0 

and variance <r*Sa* = so that, given s®, ““ {<^/Vs» t = 1 • p 





indepencfently nomialiy distributed with mean 0 md varilmce But 

the random variables ti , in section 3 are of the form Xil^/r where tihe are inde¬ 
pendently normally distributed with mean 0 and variance <r*, while r/<r* has ^ 
distribution independent of the Xi . Thus U can be consider to have 
been obtained by first selecting a stochastic variable r such that tia has the 
distribution of Xno^ selecting to be independency normaHy dis¬ 

tributed, given r, with mean 0 and variance a^/r. Since r corresponds with 
(no< — /*)«*> comparing this with the above, we fmd that 


i—1 _ 

Vs \^not — M 


i = 1 • • • p 


have the same joint distribution as the . The 


V {net — 8)0 


are constants, so 


„ KS"'*'')’ , I i. 

e{not — m) [\/einot — m) ~ m)J 


<-i lV«(no^ — m) 


has the same distribution (34) as ^ (U — c,)* with c, = f</\/(no< — m)« • 

The tests of significance and confidence regions are obtained by a procedure 
completely analogous to that used in the case of Student’s hypothesis. If we 
define k = by 

( 62 ) F{Fp,nQt—i$ > Aj} = Of, 

then a critical region of size a for testing Ho is given by 


'r >k. 


Its power function is 


(64) = 

Similarly, a confidence region for {*, i = 1 • • * p, of confidence coefficient 1 — a 
is given by the set of all f» such that 


not — 


ip) < kf 


e(no< — m) 
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It ia evident that this defines the interior of the h}i)er^here 
(67) 

- < *»p 

whose volume is independent of the variance $\sigma^2$.

The distribution of n, the required number of sets of observations for the 
above tests and confidence intervals is given by 

P{n «no + lj + 


(68) 

= P{(not — < (no + l)(not — it)e/ff*\ 


■ < 'I ■ ■(V5)‘r(w i 

where 


(69) 

V = (no + l){no< - M)«/(r* 

5 * no^ — M 

and 


P{n » y} = P < 

<“+!<!' + l} 


(70) = P{(y — < xl < 

for integral v > wo + 1, all other values being impossible. 

Thus E{n) satisfies the inequalities 

(71) <E(n) 

< (V2)W) I *■‘““‘^‘(^+0'*“}’ 

which can be rewritten 

(no + l)P{xl <!/)+" P{xl+* > v\ 

< E{n) 


< (no + i)P{x» < + “ P{x»+* > vl + P{x» > y]- 


(72) 







The modifications required to avoid wasting information are exactly analogous
to those made in the case of the test for Student's hypothesis.


6. Non-existence of a single-sample test for a linear hypothesis whose power
is independent of the variance. The canonical form (see Tang) for a linear
hypothesis in the single sample case can be derived immediately from (68) and
(64). Let $x_i$, $i = 1, \cdots, n$, be independently normally distributed with means

(73) $E x_i = \xi_i$, $i = 1, \cdots, p$

$E x_i = 0$, $i = p + 1, \cdots, n$

and variance $\sigma^2$. The $\xi_i$ and $\sigma^2$ are unknown, and we wish to test $H_0: \xi_i = 0$,
$i = 1, \cdots, p$.

The most powerful test for $H_0$ against a given alternative $H_1: \xi_i = \xi_{i0}$, $i = 1, \cdots, p$,
if the variance $\sigma^2$ is known, is that based upon the probability ratio (see Neyman
and Pearson [4])


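(74) $\dfrac{p_1}{p_0} = \dfrac{(\sqrt{2\pi}\,\sigma)^{-n} \exp\left\{-\dfrac{1}{2\sigma^2}\left[\sum_{i=1}^{p}(x_i - \xi_{i0})^2 + \sum_{i=p+1}^{n} x_i^2\right]\right\}}{(\sqrt{2\pi}\,\sigma)^{-n} \exp\left\{-\dfrac{1}{2\sigma^2}\sum_{i=1}^{n} x_i^2\right\}}.$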


Since any strictly increasing function of $p_1/p_0$ is equivalent for this purpose,
we can use

(75) $\varphi(x_1, \cdots, x_p) = \sum_{i=1}^{p} \xi_{i0} x_i$.

The critical region of size $\alpha$ based upon $\varphi$ is given by


(76) 

where 

(77) 


WoM « El 


? €<oX< 


’Vp- 


>“(> 


Va 


— r 




dx 


since, under $H_0$, $\sum_{i=1}^{p} \xi_{i0} x_i$ is normally distributed with mean 0 and variance
$\sigma^2 \sum_{i=1}^{p} \xi_{i0}^2$. Under $H_1$, $\sum_{i=1}^{p} \xi_{i0} x_i$ is normally distributed with mean
$\sum_{i=1}^{p} \xi_{i0}^2$ and variance $\sigma^2 \sum_{i=1}^{p} \xi_{i0}^2$. Thus the power of the test for the
alternative $H_1$ as a function of $\sigma$ is

1 - jSoW = t Wtia) I (i - (io, 



f i f?o 

Now let us suppose there exists a test based on the critical region $W$ of size $\alpha$
whose power $1 - \beta$ is independent of $\sigma^2$. Since $W_0(\sigma)$ is the best critical region
of size $\alpha$ for any $\sigma$ we must have

(79) 1 - /J < 1 - M<r) = dx, 

so that

(80) 1 - ^ < gib. [1 - A((r)] = :^ jf* dx = a. 

By interchanging $H_0$ and $H_1$ we can reverse the inequality (80), proving

(81) $1 - \beta = \alpha$.

Thus any single-sample test for a linear hypothesis whose power is independent
of the variance has constant power equal to the size of the critical region.
COMPACT COMPUTATION OF THE INVERSE OF A MATRIX

By Frederick V. Waugh and Paul S. Dwyer
War Food Administration and The University of Michigan

1. Introduction. Among the most common applications of mathematics to
practical problems are the solution of simultaneous equations, the evaluation of
determinants, and the computation of the complete inverse (or the complete
adjugate) of a given matrix. Even with modern computing machines these are
laborious, time-consuming jobs. For that reason there has been great interest
in recent years in the development of so-called "compact" methods; that is,
methods that eliminate all unnecessary detail, that use computing machines
to do as much of the work as possible, and that only require copying the results
needed in further analysis.

In 1935 a paper by one of the authors [1] and since then papers by the other 
author [2], [3], [4], [5], [6] and [7] have outlined a variety of compact methods 
and have applied them to actual problems. These papers, together with other 
recent contributions, such as those presented in [8], [9] and [10], have resulted
in much improved and more compact techniques in the general field of the solu¬ 
tion of linear simultaneous equations and allied topics, especially if the matrix 
is axi-symmetric. It is not generally recognized, however, that extension of 
these procedures (usually involving matrix factorization [7] [10]) can be used 
to compute the inverse (and adjugate) directly from the matrix factors without 
the necessity of the reduction of the unit matrix [11; 150] [2; 121] when the 
matrix is non-symmetric. 

The present paper extends the use of compact methods in three ways. 

(a) It presents a method of computing the inverse (and adjugate) of a symmetric
or non-symmetric matrix by compact Gaussian methods without the
formal reduction of an auxiliary identity matrix.

(b) It introduces the method of multiplication and subtraction with division— 
a modification of the method of multiplication and subtraction—and shows that 
the terms recorded in the compact solution are themselves determinants which 
are minors of the determinant of the matrix. 

(c) It uses the method of multiplication and subtraction with division as a 
compact means of computing the exact value of any minor of the determinant 
of the matrix (whether symmetric or non-symmetric). It further shows how all 
cofactors of order n — 1 (constituting the adjugate) can be computed from a 
compact presentation of the calculations of the determinant of the matrix. 

2. Gaussian methods and notation. Probably the method most generally
used to solve simultaneous equations is the division method originated by Gauss
[12]. Variations of this method are known as the Doolittle Method [13], the
method of pivotal condensation [14], the method of single division [2; 104-112],
and the Crout method [8]. The methods as outlined by Gauss and Doolittle
are applicable only to axi-symmetric matrices (common to least squares theory)
while a more general presentation, applicable to non-symmetric matrices as well,
has been made by more recent authors.

The compact form of this method, extended to apply to the non-symmetric
matrix, used in this paper is as follows:

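Given the matrix

(1) $\mathbf{a} = (a_{rk}) = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3n} \\ \vdots & & & & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn} \end{bmatrix}$

we compute

(2) $\begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ b_{21} & a_{22\cdot1} & a_{23\cdot1} & \cdots & a_{2n\cdot1} \\ b_{31} & b_{32\cdot1} & a_{33\cdot12} & \cdots & a_{3n\cdot12} \\ \vdots & & & & \vdots \\ b_{n1} & b_{n2\cdot1} & b_{n3\cdot12} & \cdots & a_{nn\cdot12\cdots n-1} \end{bmatrix}$

where

(3)
$b_{r1} = a_{r1}/a_{11}$
$a_{2k\cdot1} = a_{2k} - b_{21}a_{1k}$
$b_{r2\cdot1} = (a_{r2} - b_{r1}a_{12})/a_{22\cdot1}$
$b_{r3\cdot12} = (a_{r3} - b_{r1}a_{13} - b_{r2\cdot1}a_{23\cdot1})/a_{33\cdot12}$

and in general

(4)
$a_{rk\cdot12\cdots j} = a_{rk\cdot12\cdots j-1} - \dfrac{a_{jk\cdot12\cdots j-1}\, a_{rj\cdot12\cdots j-1}}{a_{jj\cdot12\cdots j-1}}$

$b_{rk\cdot12\cdots j} = \dfrac{a_{rk\cdot12\cdots j}}{a_{kk\cdot12\cdots j}}$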

It should be noted that Crout's presentation [8] is similar to that used here
except that Crout divides the elements of each row by the leading element while
we divide the elements of columns.

The notation used above, introduced by one of the authors [2], parallels that
used extensively in multiple correlation and regression theory. It differs somewhat
from the notation used by Gauss. See [12; 69].

Since every $b$ is the ratio of two $a$'s it follows that every $b$ can be written in
terms of $a$'s so that the formulas can be written in terms of $a$'s alone. This is
what Gauss did although he used [ ]'s instead of $a$'s. Gauss also used letters to
indicate the primary subscripts and a single secondary subscript to indicate
the number of eliminations. Thus our $a_{22\cdot1}$ was written by Gauss as $[bb, 1]$ and
$a_{33\cdot12}$ appeared as $[cc, 2]$.

It is in the interest of less extensive notation and it makes our notation somewhat
closer to that introduced by Gauss if we replace

$a_{rk\cdot12\cdots j}$ by $a_{rk\cdot(j)}$

$b_{rk\cdot12\cdots j}$ by $b_{rk\cdot(j)}$.


This shortened notation can always be used when the secondary subscripts
include all the integers from 1 to $j$. In this modified notation the formulas (4)
become

(5)
$a_{rk\cdot(j)} = a_{rk\cdot(j-1)} - \dfrac{a_{jk\cdot(j-1)}\, a_{rj\cdot(j-1)}}{a_{jj\cdot(j-1)}}$

$b_{rk\cdot(j)} = \dfrac{a_{rk\cdot(j)}}{a_{kk\cdot(j)}}$


3. Solution by matrix factorization. The values of matrix (2) are in general
not final answers to proposed problems but they are values from which final
answers can be computed. The matrix (2) exhibits essentially both the triangular
matrix of the $a_{rk\cdot(j)}$ which we call $\mathbf{t}$ and the triangular matrix of the $b_{rk\cdot(j)}$ which
we call $\mathbf{s}$. (The diagonal entries of the $\mathbf{s}$ matrix are all unity and do not appear.)
Hence (2) is really $\mathbf{s} - \mathbf{I} + \mathbf{t}$.

A basic property, useful in most problems involving the use of (2), is that $\mathbf{s}$
and $\mathbf{t}$ are factors of $\mathbf{a}$. Thus

(6) $\mathbf{a} = \mathbf{s}\mathbf{t}$ and $\mathbf{a} - \mathbf{s}\mathbf{t} = 0$.

That this is true in the symmetric case was proved in an earlier paper [7; 86].
That this is also true for the non-symmetric case is now shown in a similar
manner.

Let $\mathbf{t}_1$ be a matrix ($n$ by $n$) with the first row composed of elements $a_{1k}$ and
all other elements 0. Let $\mathbf{s}_1$ be a similar matrix with first column elements
$b_{r1} = a_{r1}/a_{11}$ and all other elements 0. Then $\mathbf{a} - \mathbf{s}_1\mathbf{t}_1 = \mathbf{a}_1 = (a_{rk\cdot1})$ is a matrix ($n$ by $n$)
with all elements of the first column and first row 0.

Next let $\mathbf{t}_2$ be a matrix ($n$ by $n$) with the second row elements $a_{2k\cdot1}$ and all
other elements 0. Let $\mathbf{s}_2$ be a matrix ($n$ by $n$) with second column elements
$b_{r2\cdot1}$ and all other elements 0. Then $\mathbf{a}_1 - \mathbf{s}_2\mathbf{t}_2 = \mathbf{a}_2 = (a_{rk\cdot(2)})$ is a matrix ($n$ by $n$)
with each element of the first two columns and first two rows equal to 0.

This process is continued through $n$ successive steps, an additional row and
column being made identically zero at each step. We have then

(7) $\mathbf{a} - \mathbf{s}_1\mathbf{t}_1 - \mathbf{s}_2\mathbf{t}_2 - \cdots - \mathbf{s}_n\mathbf{t}_n = 0$.

Now consider the triangular matrix

$\mathbf{t} = \mathbf{t}_1 + \mathbf{t}_2 + \mathbf{t}_3 + \cdots + \mathbf{t}_n$

with its rows composed of the non-zero rows of the $\mathbf{t}_i$. Consider also the triangular
matrix $\mathbf{s} = \mathbf{s}_1 + \mathbf{s}_2 + \cdots + \mathbf{s}_n$. Then $\mathbf{s}\mathbf{t} = \mathbf{s}_1\mathbf{t}_1 + \mathbf{s}_2\mathbf{t}_2 + \cdots + \mathbf{s}_n\mathbf{t}_n$ since
$\mathbf{s}_i\mathbf{t}_j = 0$ for $i \neq j$; and (7) becomes

$\mathbf{a} - \mathbf{s}\mathbf{t} = 0$ or $\mathbf{a} = \mathbf{s}\mathbf{t}$.
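The factorization argument above can be checked numerically. The following is a minimal Python sketch, not the authors' machine procedure: the function names `gauss_factors` and `matmul` are hypothetical, no row interchanges are attempted (every leading element is assumed non-zero), and the test matrix is the 4 by 4 matrix of the paper's numerical illustration, with third row beginning -12 as the recorded value -.46154 implies.

```python
def gauss_factors(a):
    """Return (s, t): s unit lower triangular, t upper triangular, a = s t.

    Assumes every leading (diagonal) element met along the way is non-zero;
    no row interchanges are attempted.
    """
    n = len(a)
    work = [[float(x) for x in row] for row in a]
    s = [[1.0 if r == k else 0.0 for k in range(n)] for r in range(n)]
    t = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Row j of the current reduced matrix becomes row j of t.
        t[j][j:] = work[j][j:]
        for r in range(j + 1, n):
            # Column j divided by its leading element becomes column j of s.
            s[r][j] = work[r][j] / work[j][j]
            for k in range(j, n):
                work[r][k] -= s[r][j] * work[j][k]
    return s, t


def matmul(x, y):
    """Plain matrix product of two lists of lists."""
    return [[sum(x[r][m] * y[m][k] for m in range(len(y)))
             for k in range(len(y[0]))] for r in range(len(x))]


a = [[26, -10, 15, 32], [19, 45, -14, -8], [-12, 16, 27, 13], [32, 29, -35, 28]]
s, t = gauss_factors(a)
# Largest absolute difference between s t and a; it should be at rounding level.
residual = max(abs(p - q) for prow, arow in zip(matmul(s, t), a)
               for p, q in zip(prow, arow))
```

Storing `s` and `t` separately is only for clarity; the compact form (2) keeps both in one array because the unit diagonal of `s` need not be recorded.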


4. Gaussian computation of inverse (and adjugate) without formal reduction
of auxiliary identity matrix. The inverse of $\mathbf{a}$, $\mathbf{a}^{-1} = \mathbf{C} = (c_{rk})$, can be calculated
directly from the matrices $\mathbf{s}$ and $\mathbf{t}$ of (2). The adjugate $\mathbf{D} = (d_{rk})$ can be calculated
by multiplication by the determinant of the matrix and this can be calculated
by the well known formula

(8) $\Delta = a_{11}\, a_{22\cdot1}\, a_{33\cdot(2)} \cdots a_{nn\cdot(n-1)}$.

The theory is presented in some detail and illustrated for the case $n = 4$ after
which a more general matrix presentation is given. The matrix equation
$\mathbf{a}\mathbf{C} = \mathbf{I}$ is equivalent to the following $4^2$ simultaneous equations in the $4^2$
unknowns $(c_{rk})$:

(9)
$\begin{array}{rcccc} & k=1 & k=2 & k=3 & k=4 \\ a_{11}c_{1k} + a_{12}c_{2k} + a_{13}c_{3k} + a_{14}c_{4k} = & 1 & 0 & 0 & 0 \\ a_{21}c_{1k} + a_{22}c_{2k} + a_{23}c_{3k} + a_{24}c_{4k} = & 0 & 1 & 0 & 0 \\ a_{31}c_{1k} + a_{32}c_{2k} + a_{33}c_{3k} + a_{34}c_{4k} = & 0 & 0 & 1 & 0 \\ a_{41}c_{1k} + a_{42}c_{2k} + a_{43}c_{3k} + a_{44}c_{4k} = & 0 & 0 & 0 & 1 \end{array}$

Now since $\mathbf{C}\mathbf{a} = \mathbf{I}$ also we have $\mathbf{a}'\mathbf{C}' = \mathbf{I}$ and there results another set of $4^2$
equations in the $4^2$ unknowns $(c_{rk})$.

(10)
$\begin{array}{rcccc} & r=1 & r=2 & r=3 & r=4 \\ a_{11}c_{r1} + a_{21}c_{r2} + a_{31}c_{r3} + a_{41}c_{r4} = & 1 & 0 & 0 & 0 \\ a_{12}c_{r1} + a_{22}c_{r2} + a_{32}c_{r3} + a_{42}c_{r4} = & 0 & 1 & 0 & 0 \\ a_{13}c_{r1} + a_{23}c_{r2} + a_{33}c_{r3} + a_{43}c_{r4} = & 0 & 0 & 1 & 0 \\ a_{14}c_{r1} + a_{24}c_{r2} + a_{34}c_{r3} + a_{44}c_{r4} = & 0 & 0 & 0 & 1 \end{array}$

Fisher [11; 160] has shown that the equations (9) could be solved by reducing the
unit matrix on the right. One of the authors has shown how to calculate the
inverse of a symmetric matrix by Gaussian methods without reducing the unit
matrix [1]. We now show how to reduce the non-symmetric matrix similarly.
By the same process used in getting from matrix (1) to matrix (2), we can reduce
the $4^2$ equations of (9) to the $4^2$ auxiliary equations below.

(11)
$\begin{array}{rcccc} & k=1 & k=2 & k=3 & k=4 \\ a_{11}c_{1k} + a_{12}c_{2k} + a_{13}c_{3k} + a_{14}c_{4k} = & 1 & 0 & 0 & 0 \\ a_{22\cdot1}c_{2k} + a_{23\cdot1}c_{3k} + a_{24\cdot1}c_{4k} = & * & 1 & 0 & 0 \\ a_{33\cdot(2)}c_{3k} + a_{34\cdot(2)}c_{4k} = & * & * & 1 & 0 \\ a_{44\cdot(3)}c_{4k} = & * & * & * & 1 \end{array}$

The terms marked * can be computed by the process. However if we do not
compute these terms we have ten equations with the right hand terms either
1 or 0.
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m 

In rinaUar way the 4“ equationa of (10) can be reduced to. ^ 4* auxiliary 
equations below. As above we may neg^t the cal(ndati(m of diagonal 
tenns, and of all terms below the diagonal, and still have six equatiito (with 
terms on the right zero). 

f*l r-2 r*-3 ^-4 

Crl + 621 Cr2 + &81 Crti + 641 Cr4 = * 0 0 0 

(12) Cr2 + ?>321 CrZ + ^>421 Cr4 = * * 0 0 

CrZ + b48-(2) Cr4 = * * * 0 

C.4= * * * * 

The ten equations of ( 11 ) with the six equations of ( 12 ) are sufficient for de¬ 
termining the inverse matrix. Solve ( 11 ) for A? = 4; then solve (12) for r 4; 
then solve ( 11 ) for A? = 3; then solve (12) for r = 3; etc. Each equation can be 
solved completely on the machine to give a value of a Cr* . 
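The alternating order of solution described above (column $k$ from (11), then row $k$ from (12), for $k = n, n-1, \cdots$) can be sketched in Python as follows. This is an illustrative reading of the procedure for general $n$, not a transcription of the authors' worksheet: the names `gauss_factors` and `compact_inverse` are hypothetical, floating point replaces the desk machine, and non-zero leading elements are assumed throughout.

```python
def gauss_factors(a):
    """Return (s, t) with a = s t, s unit lower triangular, t upper triangular."""
    n = len(a)
    work = [[float(x) for x in row] for row in a]
    s = [[1.0 if r == k else 0.0 for k in range(n)] for r in range(n)]
    t = [[0.0] * n for _ in range(n)]
    for j in range(n):
        t[j][j:] = work[j][j:]
        for r in range(j + 1, n):
            s[r][j] = work[r][j] / work[j][j]
            for k in range(j, n):
                work[r][k] -= s[r][j] * work[j][k]
    return s, t


def compact_inverse(a):
    """Inverse from s and t alone, without reducing an auxiliary unit matrix."""
    n = len(a)
    s, t = gauss_factors(a)
    c = [[0.0] * n for _ in range(n)]
    for k in range(n - 1, -1, -1):
        # Equations of type (11): column k, rows k, k-1, ..., 0.
        # Right-hand side is 1 on the diagonal and 0 above it.
        for i in range(k, -1, -1):
            rhs = 1.0 if i == k else 0.0
            known = sum(t[i][m] * c[m][k] for m in range(i + 1, n))
            c[i][k] = (rhs - known) / t[i][i]
        # Equations of type (12): row k, columns k-1, ..., 0.
        # Right-hand side is 0 and the leading coefficient is 1.
        for j in range(k - 1, -1, -1):
            c[k][j] = -sum(s[m][j] * c[k][m] for m in range(j + 1, n))
    return c


a = [[26, -10, 15, 32], [19, 45, -14, -8], [-12, 16, 27, 13], [32, 29, -35, 28]]
c = compact_inverse(a)
# Check multiplication a c, which should reproduce the unit matrix.
check = [[sum(a[r][m] * c[m][k] for m in range(4)) for k in range(4)]
         for r in range(4)]
```

Each unknown is obtained from one equation whose other unknowns are already available, which is why the two sets of equations must be taken in the alternating order stated.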

It should be noted that Gaussian methods are approximation methods since
they are division methods. For a discussion and treatment of the errors
resulting the reader is referred to papers by Hotelling [9] and Satterthwaite [10]
to which further reference is made in the next section.

Different forms for presentation of the results may be used. We suggest
the following form which presents first the matrix (1), then the terms of the
matrix (2). The terms of the matrix $\mathbf{C}'$ are then computed by (11) and (12)
and placed diagonally adjacent to the terms of (2). The transpose of $\mathbf{C}$ is used
so that the check multiplication by $\mathbf{a}$ may be most easily accomplished. The
result of this multiplication which next appears shows that the computed value
of $\mathbf{a}^{-1}$ is correct to three places. The final matrix of Table I gives the value of
the adjugate, $\mathbf{D}$, as found by multiplying each element of the inverse
by (26)(52.308)(39.356)(43.071) = 2,305,300 (to five places).

It is possible to check the accuracy of the entries of each row and column
of the matrix (2) separately by using a check sum to the right of each row and
at the bottom of each column. We have not taken the space to show check
sums and they are not particularly needed after one gets a little practice with
the method. In any case $\mathbf{a}\mathbf{a}^{-1}$ should be computed as a final check.

A more general matrix presentation results from the use of (6). The matrix
equation $\mathbf{a}\mathbf{C} = \mathbf{I}$ becomes $\mathbf{s}\mathbf{t}\mathbf{C} = \mathbf{I}$ and hence the auxiliary equation becomes

(13) $\mathbf{t}\mathbf{C} = \mathbf{s}^{-1}$.

Now since $\mathbf{s}$ is triangular with unit diagonal terms and zeros above the diagonal,
it follows that $\mathbf{s}^{-1}$ also has unit diagonal terms with zeros above the diagonal.
Hence we can select $n(n+1)/2$ equations from the $n^2$ equations of (13)
which demand no further knowledge of the entries of $\mathbf{s}^{-1}$. A similar treatment
of the matrix equation $\mathbf{a}'\mathbf{C}' = \mathbf{I}$, $\mathbf{t}'\mathbf{s}'\mathbf{C}' = \mathbf{I}$ and

(14) $\mathbf{s}'\mathbf{C}' = (\mathbf{t}')^{-1}$

yields $n(n-1)/2$ equations involving zero terms of $(\mathbf{t}')^{-1}$. These two sets of
equations taken together in the proper order are sufficient for calculating the
values in the inverse.

It may be of interest to note that this is also a procedure for calculating
$\mathbf{t}^{-1}\mathbf{s}^{-1}$ when $\mathbf{t}$ and $\mathbf{s}$ are known without the calculation of $\mathbf{t}^{-1}$ and $\mathbf{s}^{-1}$ separately
since

(15) $\mathbf{C} = \mathbf{a}^{-1} = \mathbf{t}^{-1}\mathbf{s}^{-1}$.


TABLE I

Suggested form for calculation

      26        -10         15         32
      19         45        -14         -8
     -12         16         27         13
      32         29        -35         28

In each pair of lines below, the first line gives the terms of matrix (2) and the
second line the terms of $\mathbf{C}'$ placed with them.

      26        -10         15         32
  .02873    -.00696     .01825    -.00283

  .73077     52.308    -24.962    -31.385
  .02436     .01239     .01440    -.02267

 -.46154     .21765     39.356     34.600
 -.02302     .01572     .00791     .01991

 1.23077     .78970    -.85753     43.071
 -.01519     .00419    -.02041     .02322

   1.000      0.000      0.000      0.000
   0.000      1.000      0.000      0.000
   0.000      0.000      1.000      0.000
   0.000      0.000      0.000      1.000

   66231     -16045      42072      -6524
   56157      28563      33196     -52261
  -53068      36239      18235      45899
  -35018       9659     -47051      53529


5. The method of multiplication and subtraction with division. We now
present a different method, based upon the work of Hermite [15] and Chiò [16]
together with important modifications suggested by the work of Dodgson [17].
Current presentations of the basic method include the "method of condensation"
[18; 45-48] and in compact forms, the "method of multiplication and subtraction"
of one of the authors [2; 197-202].

In Gaussian methods we divide each element of a column by the leading
(diagonal) element of that column. In the method of multiplication and
subtraction we use the leading element as a "pivot" forming a number of
two-rowed determinants. Thus we use the leading elements as multipliers rather
than as divisors. No divisions are made in this method. This is a very real
advantage when the elements of the original matrix contain only two (or three)
digits each and when $n < 7$ (or 5). In such cases we can use this method to
compute exactly the values of any minor of the determinant of the matrix and
even the adjugate itself.

It is perhaps well to mention here that error control is difficult with division
(Gaussian) methods. Even if many significant places are carried the errors
may be significant, cumulative, and difficult to measure. The techniques
suggested by the papers of Hotelling [9] and Satterthwaite [10] are most useful in
developing error control in matrix calculation. However, where accuracy is
important, and when the number of digits is not excessive, there appears to be
merit in calculating the exact values.

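In the method of multiplication and subtraction, we compute from the matrix
(1) the following matrix

(16) $\begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & A_{22\cdot1} & A_{23\cdot1} & \cdots & A_{2n\cdot1} \\ a_{31} & A_{32\cdot1} & A_{33\cdot(2)} & \cdots & A_{3n\cdot(2)} \\ \vdots & & & & \vdots \\ a_{n1} & A_{n2\cdot1} & A_{n3\cdot(2)} & \cdots & A_{nn\cdot(n-1)} \end{bmatrix}$

where

(17)
$A_{rk\cdot1} = a_{11}a_{rk} - a_{1k}a_{r1}$
$A_{rk\cdot(2)} = A_{22\cdot1}A_{rk\cdot1} - A_{2k\cdot1}A_{r2\cdot1}$

and in general

$A_{rk\cdot(j)} = A_{jj\cdot(j-1)}A_{rk\cdot(j-1)} - A_{jk\cdot(j-1)}A_{rj\cdot(j-1)}$.

This notation is similar to that used in connection with Gaussian methods above.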

In the method of multiplication and subtraction with division, we compute
from the matrix (1) the following matrix:

In general the method calls for the calculation of entries according to the
method of multiplication and subtraction but in addition calls for the division
by the leading element of the second preceding row or column. Since this
division must be exact, as is shown in the next section, we have at each stage
a good numerical check on the work as well as an exact value of the entry.
Furthermore it is shown in the next section that the value of $B_{rk\cdot(j)}$ is the exact value
of the determinant

(21) $\begin{vmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1j} & a_{1k} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2j} & a_{2k} \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3j} & a_{3k} \\ \vdots & & & & & \vdots \\ a_{j1} & a_{j2} & a_{j3} & \cdots & a_{jj} & a_{jk} \\ a_{r1} & a_{r2} & a_{r3} & \cdots & a_{rj} & a_{rk} \end{vmatrix}$

All the recorded entries (themselves values of determinants) are calculated on
the machine. The only limitation is the number of places the machine provides.
For the trivial problems (composed of small integers) found in most texts of
College Algebra, one can calculate the values readily without machines. For
example the determinant

$\begin{vmatrix} 2 & 1 & -3 & 4 \\ 3 & 2 & 2 & 1 \\ -2 & -1 & 1 & 3 \\ 4 & -3 & 2 & 1 \end{vmatrix}$ yields at once $\begin{bmatrix} 2 & 1 & -3 & 4 \\ 3 & 1 & 13 & -10 \\ -2 & 0 & -2 & 7 \\ 4 & -10 & 73 & -397 \end{bmatrix}$

and the value of $\Delta$ is $-397$. All the other entries are also minors of $\Delta$.

Dodgson introduced a method of multiplication and subtraction with division
as early as 1866 [17]. He however used a moving pivot. For our purposes it
seems preferable to use a fixed pivot as we suggest in this paper.
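The method of multiplication and subtraction with division can be sketched in a few lines of Python using exact integer arithmetic. The sketch below is an illustration only: the function name `compact_minors` is hypothetical, every leading element is assumed non-zero, and the rule coded is the one stated in the text (cross-multiply about the current leading element, then divide by the leading element of the preceding stage, the first divisor being 1). An inexact division raises an error, which mirrors the numerical check the authors mention.

```python
def compact_minors(a):
    """Return the compact matrix whose entry (r, k) is B_{rk.(j)}, j = min(r, k).

    Works on integer matrices; assumes each leading element is non-zero.
    """
    n = len(a)
    b = [list(row) for row in a]
    previous_pivot = 1
    for j in range(n - 1):
        pivot = b[j][j]
        for r in range(j + 1, n):
            for k in range(j + 1, n):
                # Two-rowed determinant about the pivot, then exact division.
                numerator = pivot * b[r][k] - b[j][k] * b[r][j]
                quotient, remainder = divmod(numerator, previous_pivot)
                if remainder != 0:
                    raise ArithmeticError("division not exact: check the work")
                b[r][k] = quotient
        previous_pivot = pivot
    return b


example = [[2, 1, -3, 4], [3, 2, 2, 1], [-2, -1, 1, 3], [4, -3, 2, 1]]
recorded = compact_minors(example)
determinant = recorded[-1][-1]

illustration = [[26, -10, 15, 32], [19, 45, -14, -8],
                [-12, 16, 27, 13], [32, 29, -35, 28]]
illustration_minors = compact_minors(illustration)
```

Because only rows and columns beyond the pivot are altered at each stage, the array is left holding exactly the entries that the compact form records, with the determinant in the last cell.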


6. Proofs of theorems involving the $B_{rk\cdot(j)}$.

(a) First theorem. We first prove that the numerator $B_{jj\cdot(j-1)}B_{rk\cdot(j-1)} - B_{jk\cdot(j-1)}B_{rj\cdot(j-1)}$
in the definition of $B_{rk\cdot(j)}$ is exactly divisible by the denominator
$B_{j-1,j-1\cdot(j-2)}$. To do this we expand the terms of this numerator of (20) with
the continued use of

(22) $B_{rk\cdot(j-1)} = \dfrac{B_{j-1,j-1\cdot(j-2)}\,B_{rk\cdot(j-2)} - B_{j-1,k\cdot(j-2)}\,B_{r,j-1\cdot(j-2)}}{B_{j-2,j-2\cdot(j-3)}}$

(which is (20) with $j$ replaced by $j - 1$) and then we multiply and cancel. It
is found that $B_{j-1,j-1\cdot(j-2)}$ is a factor of all non-cancellable terms so the exact
divisibility is proved.

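(b) Second theorem. We next prove that $B_{rk\cdot(j)}$ is the value of the determinant
(21). We illustrate first for $j = 3$ and then give a more general proof. When
$j = 3$

$\begin{vmatrix} a_{11} & a_{12} & a_{13} & a_{1k} \\ a_{21} & a_{22} & a_{23} & a_{2k} \\ a_{31} & a_{32} & a_{33} & a_{3k} \\ a_{r1} & a_{r2} & a_{r3} & a_{rk} \end{vmatrix} = \dfrac{1}{a_{11}^3} \begin{vmatrix} a_{11} & a_{12} & a_{13} & a_{1k} \\ 0 & B_{22\cdot1} & B_{23\cdot1} & B_{2k\cdot1} \\ 0 & B_{32\cdot1} & B_{33\cdot1} & B_{3k\cdot1} \\ 0 & B_{r2\cdot1} & B_{r3\cdot1} & B_{rk\cdot1} \end{vmatrix}$

$= \dfrac{1}{a_{11}^2} \begin{vmatrix} B_{22\cdot1} & B_{23\cdot1} & B_{2k\cdot1} \\ B_{32\cdot1} & B_{33\cdot1} & B_{3k\cdot1} \\ B_{r2\cdot1} & B_{r3\cdot1} & B_{rk\cdot1} \end{vmatrix} = \dfrac{1}{B_{22\cdot1}} \begin{vmatrix} B_{33\cdot(2)} & B_{3k\cdot(2)} \\ B_{r3\cdot(2)} & B_{rk\cdot(2)} \end{vmatrix} = B_{rk\cdot(3)}.$

In the more general case we designate the determinant (21) by $|a_{rk}|$ and reduce
the order by the "condensation" method just illustrated. It is understood
that the values of $B_{rk\cdot(i)}$ used in the following proof have primary subscripts
larger than secondary subscripts since the rank of the resulting determinant
decreases with each condensation.

(23) $|a_{rk}| = \dfrac{1}{a_{11}^{\,j-1}}\,|B_{rk\cdot1}| = \dfrac{1}{B_{22\cdot1}^{\,j-2}}\,|B_{rk\cdot(2)}| = \cdots = \dfrac{1}{B_{j-1,j-1\cdot(j-2)}}\,|B_{rk\cdot(j-1)}| = B_{rk\cdot(j)}.$

It is to be noted that the first theorem, since each $B_{rk\cdot(j)}$ can be interpreted as a
determinant by the second theorem, is a corollary of a well known theorem
[19; 33]. In a conventional determinantal notation it might appear as

(24) $\Delta\,\Delta_{jk;rj} = \Delta_{rk}\Delta_{jj} - \Delta_{rj}\Delta_{jk}$

where the first subscripts indicate deleted rows and the second subscripts deleted
columns.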

(c) Third theorem. We next relate the values of $B_{rk\cdot(j)}$ and the values $a_{rk\cdot(j)}$
and $b_{rk\cdot(j)}$. With the use of the second theorem (23) and (8) we have

(25) $\dfrac{B_{rk\cdot(j)}}{a_{rk\cdot(j)}} = \dfrac{a_{11}\,a_{22\cdot1}\,a_{33\cdot(2)} \cdots a_{jj\cdot(j-1)}\,a_{rk\cdot(j)}}{a_{rk\cdot(j)}} = B_{jj\cdot(j-1)}$

and with the additional use of (4)

(26) $\dfrac{B_{rk\cdot(j)}}{b_{rk\cdot(j)}} = \dfrac{a_{11}\,a_{22\cdot1}\,a_{33\cdot(2)} \cdots a_{jj\cdot(j-1)}\,a_{rk\cdot(j)}}{a_{rk\cdot(j)}/a_{kk\cdot(j)}} = B_{kk\cdot(j)}.$

These formulas may be written in the form

(27) $B_{rk\cdot(j)} = B_{jj\cdot(j-1)}\,a_{rk\cdot(j)}$

(28) $B_{rk\cdot(j)} = B_{kk\cdot(j)}\,b_{rk\cdot(j)}$

and since $B_{jj\cdot(j-1)}$ and $B_{kk\cdot(j)}$ are diagonal terms, it follows that the matrix
(18) can be obtained from the matrix (2) by multiplication by diagonal matrices.

(d) Fourth Theorem. A fourth theorem gives explicit matrix formulation
to these results and shows how the values of the matrix (18) can be used in
factoring the matrix (1). Now (27) and (28) can be written in the form

(29) $\mathbf{T} = \mathbf{M}_T\,\mathbf{t}$

(30) $\mathbf{S} = \mathbf{s}\,\mathbf{M}_S$

where $\mathbf{M}_T$ is the diagonal matrix which multiplies $\mathbf{t}$ to get $\mathbf{T}$ and $\mathbf{M}_S$ is the
diagonal matrix which multiplies $\mathbf{s}$ to get $\mathbf{S}$. The values of the $\mathbf{T}$ matrix are
the values of (18) with $r \le k$ while the values of the $\mathbf{S}$ matrix are the values of
(18) with $r \ge k$. The diagonal matrix $\mathbf{M}_T$ is composed of diagonal elements
$[1, a_{11}, B_{22\cdot1}, \cdots, B_{n-1,n-1\cdot(n-2)}]$ while the matrix $\mathbf{M}_S$ is composed of diagonal
elements $[a_{11}, B_{22\cdot1}, B_{33\cdot(2)}, \cdots, B_{nn\cdot(n-1)}]$. The basic matrix factorization
equation (6) then appears as

(31) $\mathbf{a} = \mathbf{S}\,\mathbf{M}_S^{-1}\,\mathbf{M}_T^{-1}\,\mathbf{T}$.

It is to be noted that exact values of elements of all these matrices are available
if the inverse diagonal matrices are written in fractional form, subject of
course to practical limitations such as number of places of computing machine,
etc.


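7. Computation of the adjugate matrix. We now present matrix formulas
which enable one to compute the adjugate of $\mathbf{a}$ compactly with the method of
multiplication and subtraction with division. If $|\mathbf{a}|$ is the determinant of $\mathbf{a}$
and $\mathbf{D}$ is the adjugate of $\mathbf{a}$, we have

$\mathbf{a}\mathbf{D} = |\mathbf{a}|\,\mathbf{I}$
$\mathbf{s}\mathbf{t}\mathbf{D} = |\mathbf{a}|\,\mathbf{I}$
$\mathbf{t}\mathbf{D} = |\mathbf{a}|\,\mathbf{s}^{-1}$
$\mathbf{M}_T\mathbf{t}\mathbf{D} = \mathbf{M}_T\,|\mathbf{a}|\,\mathbf{s}^{-1}$
(32) $\mathbf{T}\mathbf{D} = \mathbf{M}_T\,|\mathbf{a}|\,\mathbf{s}^{-1}$

and similarly

$\mathbf{a}'\mathbf{D}' = |\mathbf{a}|\,\mathbf{I}$
$\mathbf{t}'\mathbf{s}'\mathbf{D}' = |\mathbf{a}|\,\mathbf{I}$
$\mathbf{s}'\mathbf{D}' = |\mathbf{a}|\,(\mathbf{t}')^{-1}$
$\mathbf{M}_S\mathbf{s}'\mathbf{D}' = \mathbf{M}_S\,|\mathbf{a}|\,(\mathbf{t}')^{-1}$
(33) $\mathbf{S}'\mathbf{D}' = \mathbf{M}_S\,|\mathbf{a}|\,(\mathbf{t}')^{-1}$

The computational procedure in getting the adjugate is very similar to that
used in getting the inverse in section 4. $\mathbf{T}$ and $\mathbf{S}$ are triangular matrices while
$\mathbf{s}^{-1}$ and $(\mathbf{t}')^{-1}$ are the matrices used before. The values of $\mathbf{M}_T$, with diagonal
elements $[1, a_{11}, B_{22\cdot1}, \cdots, B_{n-1,n-1\cdot(n-2)}]$, of $\mathbf{M}_S$, with diagonal elements
$[a_{11}, B_{22\cdot1}, B_{33\cdot(2)}, \cdots, B_{nn\cdot(n-1)}]$, and of $|\mathbf{a}|$ are first computed
by (18) so that $\mathbf{M}_T|\mathbf{a}|$ and $\mathbf{M}_S|\mathbf{a}|$ can be calculated. Without further calculation
we are able to select $n(n+1)/2$ equations from the matrix equation (32)
having known coefficients on the right, $n(n-1)/2$ of which are zero, and $n(n-1)/2$
equations from the matrix equation (33) having zero coefficients on the right.
These constitute the $n^2$ equations necessary to determine the $n^2$ values of $d_{rk}$.
These values of $d_{rk}$ can all be calculated directly on the machine and, what is
more useful in discovering calculational errors, the divisions yielding the $d_{rk}$
must be exact.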

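For $n = 4$ these $n^2$ equations are

(34)
$\begin{array}{rcccc} & k=1 & k=2 & k=3 & k=4 \\ a_{11}d_{1k} + a_{12}d_{2k} + a_{13}d_{3k} + a_{14}d_{4k} = & |\mathbf{a}| & 0 & 0 & 0 \\ B_{22\cdot1}d_{2k} + B_{23\cdot1}d_{3k} + B_{24\cdot1}d_{4k} = & * & a_{11}|\mathbf{a}| & 0 & 0 \\ B_{33\cdot(2)}d_{3k} + B_{34\cdot(2)}d_{4k} = & * & * & B_{22\cdot1}|\mathbf{a}| & 0 \\ B_{44\cdot(3)}d_{4k} = & * & * & * & B_{33\cdot(2)}|\mathbf{a}| \end{array}$

(35)
$\begin{array}{rcccc} & r=1 & r=2 & r=3 & r=4 \\ a_{11}d_{r1} + a_{21}d_{r2} + a_{31}d_{r3} + a_{41}d_{r4} = & * & 0 & 0 & 0 \\ B_{22\cdot1}d_{r2} + B_{32\cdot1}d_{r3} + B_{42\cdot1}d_{r4} = & * & * & 0 & 0 \\ B_{33\cdot(2)}d_{r3} + B_{43\cdot(2)}d_{r4} = & * & * & * & 0 \end{array}$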


The process is similar to that of section 4. An illustration for the case $n = 4$
is given in Table II. The matrix of the $B$'s is directly below the matrix $\mathbf{a}$ and
the calculated values of the elements of $\mathbf{D}'$ (obtained by solving (34) and (35))
are placed diagonally in the cells with the $B$'s. The values of the transpose of
$\mathbf{D}$ are used so that the check, premultiplication by $\mathbf{a}$, is easily carried out. The
next matrix in Table II exhibits $\mathbf{a}\mathbf{D} = |\mathbf{a}|\,\mathbf{I}$. The last matrix of Table II
is a five decimal place approximation to $\mathbf{C}'$ which is obtained by dividing the
entries of $\mathbf{D}'$ by $|\mathbf{a}|$. Since we know these are the correct five decimal place
values of $\mathbf{C}'$, we may compare the corresponding values of Table I to see how
much those are in error. It should be noticed that the approximation to $\mathbf{C}'$ may
be readily carried to more than five decimal places if desired.

As with the Gaussian methods, it is possible here, also, to check each row
and column individually by using check sums.
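The exact computation of the adjugate from the recorded $B$'s can be sketched as follows. This is a hedged Python illustration of equations of the type (34) and (35) for general $n$, solved in the same alternating order as in section 4; the names `compact_minors`, `exact_divide` and `exact_adjugate` are hypothetical, integer arithmetic stands in for the desk machine, and non-zero leading elements are assumed.

```python
def compact_minors(a):
    """Compact matrix of the B's (multiplication and subtraction with division)."""
    n = len(a)
    b = [list(row) for row in a]
    previous_pivot = 1
    for j in range(n - 1):
        pivot = b[j][j]
        for r in range(j + 1, n):
            for k in range(j + 1, n):
                b[r][k] = exact_divide(pivot * b[r][k] - b[j][k] * b[r][j],
                                       previous_pivot)
        previous_pivot = pivot
    return b


def exact_divide(numerator, denominator):
    """Integer division that fails loudly if it is not exact."""
    quotient, remainder = divmod(numerator, denominator)
    if remainder != 0:
        raise ArithmeticError("division not exact: check the work")
    return quotient


def exact_adjugate(a):
    """Return (adjugate, determinant) of an integer matrix, exactly."""
    n = len(a)
    b = compact_minors(a)
    det = b[n - 1][n - 1]
    # Diagonal of M_T: 1, a11, B22.1, ..., B(n-1,n-1).(n-2)
    m_t = [1] + [b[i][i] for i in range(n - 1)]
    d = [[0] * n for _ in range(n)]
    for k in range(n - 1, -1, -1):
        # Type (34): column k, rows k, ..., 0, using the upper part of the B's.
        for i in range(k, -1, -1):
            rhs = m_t[i] * det if i == k else 0
            known = sum(b[i][m] * d[m][k] for m in range(i + 1, n))
            d[i][k] = exact_divide(rhs - known, b[i][i])
        # Type (35): row k, columns k-1, ..., 0, using the lower part of the B's.
        for j in range(k - 1, -1, -1):
            known = sum(b[m][j] * d[k][m] for m in range(j + 1, n))
            d[k][j] = exact_divide(-known, b[j][j])
    return d, det


a = [[26, -10, 15, 32], [19, 45, -14, -8], [-12, 16, 27, 13], [32, 29, -35, 28]]
adjugate, determinant = exact_adjugate(a)
# Check: premultiplication by a should give the determinant times the unit matrix.
check = [[sum(a[r][m] * adjugate[m][k] for m in range(4)) for k in range(4)]
         for r in range(4)]
```

Every division passes through `exact_divide`, so an arithmetic slip would surface immediately, which is the checking property the text emphasizes.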

The work necessary for the computation of the adjugate from the matrix of
the $B$'s can be shortened somewhat by the use of the fact that the adjugate is
composed of the cofactors of the $a_{rk}$. Now the cofactors of the four terms in
the lower right hand corner are =* Bn_i.«.-i.u-. 2 ) ; * ~B„>.i.n.(n~ 2 ); 

dn.n-i = —B»,„-i.(„- 2 ) ; and dnn *= B^tn.(„_ 2 ) and these are available from the 
calculation of the B's though B„„.(,^_ 2 ) is not recorded. (See the lower right 
four entries of the $B$'s and $d$'s in Table II). With these four values
immediately available, the use of but $n^2 - 4$ additional equations is demanded,
or this additional information can be used in checking.

TABLE II

Suggested form for computation of adjugate (with check) and then inverse

      26        -10         15         32
      19         45        -14         -8
     -12         16         27         13
      32         29        -35         28

In each pair of lines below, the first line gives the $B$'s and the second line
the elements of $\mathbf{D}'$ placed with them.

      26        -10         15         32
   66233     -16033      42069      -6503

      19       1360       -649       -816
   56151      28558      33194     -52258

     -12        296      53524      47056
  -53068      36236      18224      45899

      32       1074     -45899    2305327
  -35013       9659     -47056      53524

 2305327          0          0          0
       0    2305327          0          0
       0          0    2305327          0
       0          0          0    2305327

  .02873    -.00695     .01825    -.00282
  .02436     .01239     .01440    -.02267
 -.02302     .01572     .00791     .01991
 -.01519     .00419    -.02041     .02322
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MULTIPLE MATCHING AND RUNS BY THE SYMBOLIC METHOD 

Irving Kaplansky and John Riordan
New York City 

1. Introduction. The two subjects in the title have generally been treated by 
distinct methods, an excellent summary of which is given by S. S. Wilks in 
Chapter X of [13]. For two-deck matching, an appreciable simplification over 
the classical work of MacMahon [7], which seems to underlie the generating 
function used by Wilks [12] and Battin [2], has been shown by one of us [5] 
to follow from symbolic methods. Here we give an elaboration of these methods 
to multiple matching and to runs. 

The basis of the symbolic method in both problems has been given in [6],
but for completeness a skeleton resume is given in Section 2 below. A new
point is stressed: the relation of coefficients in polynomials of the symbolic
method to factorial moments (cf. Fréchet [4]).

The emphasis for the most part is on showing the expedition of the symbolic 
method in reaching known results, but in several instances new results are 
obtained. 

2. Symbolic expressions and moments. Let $A_1, \cdots, A_n$ be arbitrary events
and let $p(A_{i_1}, \cdots, A_{i_k})$ denote the joint probability of $A_{i_1}, \cdots, A_{i_k}$; let
$P_r$ be the probability that exactly $r$ of the events occur. Then

(1) $P_r = \sum_k (-1)^{k-r} \binom{k}{r} \sum p(A_{i_1}, \cdots, A_{i_k})$

and in particular

$P_0 = 1 - \sum p(A_i) + \sum p(A_i, A_j) - \cdots$

or symbolically

(2) $P_0 = [1 - p(A_1)][1 - p(A_2)] \cdots [1 - p(A_n)]$.

The cases to be studied will be exclusively ones where so-called quasi-symmetry
holds, i.e., $p(A_{i_1}, \cdots, A_{i_k})$ is either 0 or a function $\varphi_k$ of $k$ alone. In that
event (2) can be evaluated as follows: suppress all products that vanish, and
form a polynomial $f(E)$ by replacing each surviving term $p(A_i)$ by $E$. Then
$P_0 = f(E)\varphi_0$ where $E$ is a displacement operator: $E\varphi_k = \varphi_{k+1}$.

The same polynomial $f(E)$ can also be used to obtain $P_r$ and the moments of
the distribution. From (1) we see that $P_r = f_r(E)\varphi_0$, where $f_r(E) = (-E)^r f^{(r)}(E)/r!$.
Again it is well known (Fréchet [4]) that the $k$-th factorial moment, defined by

$M_{(k)} = \sum_t t(t-1) \cdots (t-k+1) P_t,$

is also given by

$M_{(k)} = k! \sum p(A_{i_1}, \cdots, A_{i_k}).$

It follows that the terms of $f(E)\varphi_0$ are essentially the factorial moments. More
precisely, if

$f(E) = \sum_k (-1)^k \alpha_k E^k$

then

(3) $M_{(k)} = k!\,\alpha_k\,\varphi_k$.

3. Card matching. To avoid complications which add nothing to the fundamental
idea, the case of three decks will be considered explicitly. As remarked
by Battin [2], there is no loss of generality in supposing that the three decks
have the same number of cards: let them be numbered from 1 to $n$. Let $p_{ijk}$
denote the probability that the $i$-th, $j$-th, and $k$-th cards of the three decks are
matched, that is, all occur in say the $l$-th place. The condition of quasi-symmetry
is fulfilled, the (symbolic) product of $k$ of the $p$'s being either 0 or
$\varphi_k = [(n - k)!/n!]^2$.

The simplest problem is to find the probability that there be no triple matches
of the form $(i, i, i)$. Since no products of the expression

$(1 - p_{111})(1 - p_{222}) \cdots (1 - p_{nnn})$

vanish, the answer is $(1 - E)^n \varphi_0$, in agreement with Anderson [1] (cf. also
problem E 589 in the American Mathematical Monthly, p. 512, 1943; solution
by John Riordan, p. 287, 1944).

Suppose now that the decks are given compositions in the usual fashion by
having $a_1, b_1, c_1$ aces respectively, $a_2, b_2, c_2$ deuces, etc. We may number the
cards so that $1, \cdots, a_1$ are aces, $a_1 + 1, \cdots, a_1 + a_2$ are deuces, and similarly
in the other decks. The probability of precisely $r$ matches among cards of the
same denomination is then given by

(4) $F(a_1, b_1, c_1)F(a_2, b_2, c_2) \cdots,$

where

$F(a, b, c) = \prod (1 - p_{ijk})$

the symbolic product being taken over ranges $i = 1, \cdots, a$, $j = 1, \cdots, b$,
$k = 1, \cdots, c$.

A simple combinatorial argument reveals that

(5) $F(a, b, c) = \sum_t (-1)^t (a)_t (b)_t (c)_t E^t / t!$

where $(a)_t = a(a - 1) \cdots (a - t + 1)$ is the Jordan factorial notation. The
problem of matching arbitrary decks is thus compactly solved by (4) and (5).
4. Examples. When decks of explicit structure are in question, the computation
of probabilities and moments reduces to straightforward algebra, as is
illustrated in the three following examples.

1. Suppose each of three decks has two suits of two cards each. Then, since

$F(2, 2, 2)^2 = (1 - 8E + 4E^2)^2 = 1 - 16E + 72E^2 - 64E^3 + 16E^4,$

it follows that

$(4!)^2 P_0 = (4!)^2 - 16(3!)^2 + 72(2!)^2 - 64(1!)^2 + 16(0!)^2$

$= 576 - 576 + 288 - 64 + 16 = 240,$

and the calculation of $(4!)^2 P_r$ may be set forth as follows:

r
0    576 - 576 + 288 - 64 + 16 = 240
1          576 - 576 + 192 - 64 = 128
2                288 - 192 + 96 = 192
3                       64 - 64 =   0
4                            16 =  16

each column being obtained by multiplying its first row entry by a binomial
coefficient. These results may be verified readily by direct enumeration.
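The direct enumeration mentioned above is small enough to carry out by machine. The following Python sketch is an illustration under stated assumptions: the first deck is held in the fixed order A, A, B, B, the other two decks run over all $4!$ orderings of their (distinguishable) cards, and a triple match is counted at every place where all three decks show the same suit. The variable names are hypothetical.

```python
from collections import Counter
from itertools import permutations

suits = "AABB"  # first deck, held fixed; two suits of two cards each

# counts[r] = number of the (4!)^2 = 576 arrangements with exactly r triple matches
counts = Counter()
for deck2 in permutations(suits):      # 24 orderings of the second deck
    for deck3 in permutations(suits):  # 24 orderings of the third deck
        matches = sum(1 for x, y, z in zip(suits, deck2, deck3) if x == y == z)
        counts[matches] += 1

distribution = [counts[r] for r in range(5)]
```

The list `distribution` holds the values of $(4!)^2 P_r$ for $r = 0, \cdots, 4$ and can be compared with the last column of the scheme above.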

2. In the case of three 5 by 5 decks, the polynomial is

$F(5, 5, 5)^5 = (1 - 125E + 4000E^2 - 36000E^3 + 72000E^4 - 14400E^5)^5$

$= 1 - 625E + 176{,}250E^2 - 29{,}711{,}250E^3 + 3{,}346{,}063{,}125E^4 \cdots$

The factorial moments can be obtained using (3).

$M_{(1)} = 625/25^2 = 1,$

$M_{(2)} = 2 \cdot 176250/(25^2 \cdot 24^2) = 47/48,$

$M_{(3)} = 7923/8464,$

$M_{(4)} = 1784567/2048288,$

the first two in agreement with Battin [2],
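
The moments in this example can be recomputed with exact rational arithmetic. The sketch below assumes that the $k$-th factorial moment is $k!$ times the absolute coefficient of $E^k$, divided by $(n)_k^m$ for $m + 1$ decks; this is the reading suggested by the numbers above, and the helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def falling(a, t):
    """Jordan factorial (a)_t."""
    out = 1
    for i in range(t):
        out *= a - i
    return out


def poly_mul(p, q, max_deg):
    """Polynomial product, truncated after the term of degree max_deg."""
    out = [0] * (max_deg + 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            if i + j <= max_deg:
                out[i + j] += x * y
    return out


def factorial_moments(suits, decks, kmax):
    """M_(1), ..., M_(kmax) for `decks` identical decks with the given suit sizes."""
    n = sum(suits)
    m = decks - 1
    poly = [1]
    for a in suits:
        factor = [(-1) ** t * falling(a, t) ** decks // factorial(t)
                  for t in range(a + 1)]
        poly = poly_mul(poly, factor, kmax)
    return [Fraction(factorial(k) * abs(poly[k]), falling(n, k) ** m)
            for k in range(1, kmax + 1)]
```

For instance, `factorial_moments([5] * 5, 3, 4)` treats three decks of five suits with five cards each.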

3. The symbolic method can be applied to more intricate kinds of matching,
as this final example shows. Suppose that the six matches represented by
(123) and its permutations are forbidden, likewise the six matches represented
by permutations of (456), and so on in groups of three. Then

$(1 - p_{123})(1 - p_{132})(1 - p_{213})(1 - p_{231})(1 - p_{312})(1 - p_{321})$

$= 1 - 6E + 6E^2 - 2E^3,$
and so the answer is

$(1 - 6E + 6E^2 - 2E^3)^{n/3}.$

The analogous problem for 4 decks has the solution

$(1 - 24E + 108E^2 - 96E^3 + 24E^4)^{n/4}.$

The generalization to an arbitrary number of decks involves the enumeration
of Latin rectangles, in itself a formidable problem.

5. Moment formulas. It is possible to deduce from (4) and (5) fairly explicit
formulas for the factorial moments. Let us define $u^{(t)} = (a)_t (b)_t (c)_t$. Then
(5) may be written symbolically as

$F(a, b, c) = \exp(-uE).$

Writing $F(a_i, b_i, c_i) = \exp(-u_i E)$, we then have

$P_0 = \exp[-(u_1 + u_2 + \cdots)E]$

$= \sum_t (u_1 + u_2 + \cdots)^t (-E)^t / t!,$

or finally, if m + 1 decks are being matched,

(6) $P_0 = \sum_t (-1)^t (u_1 + u_2 + \cdots)^t / [t!\,(n)_t^m].$

It is to be borne in mind that after expansion of $(u_1 + u_2 + \cdots)^t$ by the
multinomial theorem, the term $u_1^i u_2^j u_3^k \cdots$ is replaced by
$u_1^{(i)} u_2^{(j)} u_3^{(k)} \cdots$ with the $u$'s defined as above.

By (3), factorial moments corresponding to (6) are given by

(7) $M_{(t)} = (u_1 + u_2 + \cdots)^t / (n)_t^m.$

Thus in particular

$n^m M_{(1)} = u_1 + u_2 + \cdots,$

$n^m (n - 1)^m M_{(2)} = (u_1 + u_2 + \cdots)^2$

$= \sum_i a_i(a_i - 1)\,b_i(b_i - 1) \cdots + 2 \sum_{i<j} a_i a_j b_i b_j \cdots,$

the cases m = 1, 2 in agreement with Battin [2].

In the simple case where m = 1 (two decks), $a_i = b_i = a$ and $n = sa$, we have
$u^{(t)} = (a)_t^2$ and

(8) $(n)_t M_{(t)} = (u + u + \cdots + u)^t$

with $s$ $u$'s in the parenthesis. The right of (8) is the multi-variable polynomial
of E. T. Bell [3], Y tiyi , ya, • • • , y«) with « (s)ti^*^ and («) a symbolic factorial 
such that y^s — etc. Instances of (8) may be compared with 

Olds [9]. 
Expanding (8) we obtain

$(n)_t M_{(t)} = (s)_t a^{2t} + \binom{t}{2} (s)_{t-1} a^{2t-2} (a - 1)^2 + \cdots$

and, since $(s)_t/(n)_t \to a^{-t}$ as $n \to \infty$, it follows that $M_{(t)} \to a^t$, i.e., the limiting
distribution is Poisson with mean $a$. As indicated in [6] one may proceed to
obtain successive terms of an asymptotic series for the distribution. These
results generalize to the case where $M_{(1)} = \sum a_i b_i / n$ approaches a finite limit as
$n \to \infty$. In certain instances where $M_{(1)} \to \infty$, asymptotic normality can be
proved (cf. [1] and [8]).

6. Successions and runs. As shown in [6], enumeration of permutations with
a specified number of 2-successions like 12, 42, ... may be accomplished by
introduction of symbols like $p_{12}$, $p_{42}$ denoting probabilities that 1 immediately
precede 2, 4 precede 2, resp. For permutations of $n$ objects of which $a_1$ are of one
kind, $a_2$ of a second, ... with $a_1 + a_2 + \cdots + a_s = n$, the probability of exactly
$r$ 2-successions is ([6] p. 914)

(9) $P_r = G(a_1)\,G(a_2) \cdots G(a_s)\,\phi_r$

with $\phi_r E^k = (-1)^{k-r} \binom{k}{r} (n - k)!/n!$ and

$G(a) = \sum_{t=0}^{a-1} (a)_t (a - 1)_t E^t / t!.$

It is to be noted that in deriving (9), elements of the first kind are numbered
1 to $a_1$, of the second $a_1 + 1$ to $a_1 + a_2$, ... and a succession occurs if either
$i$ precedes $j$ or $j$ precedes $i$ with $i$ and $j$ in the same set.

For $s = 2$, i.e., two kinds of elements, there is a simpler formula due to Stevens
[10], but for the general case (9) seems to be the only reasonably explicit solution
known. In particular, for the function $F(a_1, \ldots, a_s)$ of Mood [8] which
enumerates the number of permutations with no 2-successions, we have

$F(a_1, \ldots, a_s) = n!\,G(a_1) \cdots G(a_s)\,\phi_0.$

Factorial moments for 2-successions are given at once by (7):

(10) $M_{(t)} = (u_1 + u_2 + \cdots + u_s)^t / (n)_t$

with $u_i^{(j)} = (a_i)_j (a_i - 1)_j$.

It is more usual to classify permutations according to the number of runs,
say $r'$, a run consisting of a succession of $i$ like elements ($i = 1, 2, \ldots$). Since
every 2-succession causes the loss of a potential run, we have $r' = n - r$, i.e. the
number of runs is $n$ diminished by the number of 2-successions. Factorial
moments $M'_{(t)}$ for runs are then given by the usual formula for change of origin:

(11) $M'_{(t)} = \sum_{i=0}^{t} (-1)^i \binom{t}{i} (n - i)_{t-i}\, M_{(i)}.$

Examples. 1. Introducing $\alpha_t$ for the $t$-th elementary symmetric function
of the $a$'s,

$\alpha_1 = a_1 + a_2 + \cdots + a_s = n,$
$\alpha_2 = a_1 a_2 + a_1 a_3 + \cdots + a_{s-1} a_s,$
$\alpha_3 = a_1 a_2 a_3 + \cdots,$

we may derive from (10) and (11) the formula

(12) $M'_{(1)} = 1 + 2\alpha_2/n$

for the mean number of runs. The variance $\sigma^2$, the same for runs and
2-successions, is given by

(13) $\sigma^2 = M_{(2)} + M_{(1)} - M_{(1)}^2 = \dfrac{2\alpha_2(2\alpha_2 - n) - 6n\alpha_3}{n^2(n - 1)}.$

For runs of two kinds of elements, formulas (12) and (13) specialize to those
given by Wald and Wolfowitz [11].
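
Formula (12) for the mean number of runs is easy to check by brute force on small compositions. The following Python sketch enumerates all $n!$ orderings of the labelled elements; the function names are illustrative only.

```python
from fractions import Fraction
from itertools import combinations, permutations


def mean_runs_formula(a):
    """Mean number of runs from (12): 1 + 2*alpha_2/n."""
    n = sum(a)
    alpha2 = sum(x * y for x, y in combinations(a, 2))
    return 1 + Fraction(2 * alpha2, n)


def count_runs(seq):
    """Number of maximal blocks of equal neighbouring elements."""
    return 1 + sum(1 for x, y in zip(seq, seq[1:]) if x != y)


def mean_runs_brute(a):
    """Average run count over all orderings of a[0] + a[1] + ... labelled items."""
    items = [kind for kind, size in enumerate(a) for _ in range(size)]
    total = 0
    runs = 0
    for perm in permutations(items):
        total += 1
        runs += count_runs(perm)
    return Fraction(runs, total)
```

With `a = (3, 2)` the formula reduces to the two-kind case, and with `a = (2, 2, 1)` it exercises three kinds of elements.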

2. For runs of elements of a single kind, factors in (9) pertaining to other
elements are suppressed. Thus if $a$ is written for $a_1$, and terms in $a_2, \ldots, a_s$ are
suppressed, (9) and (10) become

$P_r = G(a)\,\phi_r,$

$M_{(t)} = (a)_t (a - 1)_t / (n)_t.$

Moments for runs are given by

$M'_{(t)} = \sum_{i=0}^{t} (-1)^i \binom{t}{i} (a - i)_{t-i}\,(a)_i (a - 1)_i / (n)_i = (a)_t (n - a + 1)_t / (n)_t$

in agreement with Mood [8].
ON THE POWER FUNCTIONS OF THE $E^2$-TEST AND THE $T^2$-TEST

By P. L. Hsu 

National University of Peking 


1. The general linear hypothesis. Every linear hypothesis about a p-variate
normal population or several such populations having common variances and 
covariances is reducible to the following canonical form [4]: The sample distri¬ 
bution, when nothing whatever has been discarded from the whole sample, being 

I ,««+») ^ J g 

( 1 ) iVir - ntr)(ysr “ Vjr) - J S 2 D dy (fe 

(n > P), 

where the $\eta_{ir}$ and the $\alpha_{ij}$ are unknown, the hypothesis to be tested is

$H$: $\eta_{ir} = 0$ $(i = 1, \ldots, p;\ r = 1, \ldots, n_1,\ n_1 < m)$.

It is clear that the $y_{ir}$ $(i = 1, \ldots, p;\ r = n_1 + 1, \ldots, m)$ can have no use.
Also, the only useful quantities supplied by the set $z_{is}$ are the statistics

$b_{ij} = \sum_{s=1}^{n} z_{is} z_{js},$

because the remaining quantities may be regarded as a set of angles which are
independent of $y_{ir}$ and the $b_{ij}$ and which has a known distribution free from any
unknown parameter in (1), [2]. After discarding the irrelevant $y$'s and the angles
there results the reduced sample distribution

K I 1 h, 1“"-'-“ exp (- J i 


•it (.t/ir - VirXVir - Vfr) - i 2 H ^ 


Hereafter the indices i, j and r shall have the following ranges: 

$i, j = 1, \ldots, p$; $r = 1, \ldots, n_1$,

and the convention that repetition of an index indicates summation will be
adopted. Writing

$a_{ij} = y_{ir} y_{jr}$, $c_{ij} = a_{ij} + b_{ij}$,

In the remaining two sections of this paper we deal exclusively with the
special cases $p = 1$ and $n_1 = 1$. According as $p = 1$ or $n_1 = 1$ we shall drop
the indices $i$ and $j$ or the index $r$.

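The case $p = 1$. When $p = 1$, (2) reduces to

$\exp(-\tfrac12 \alpha c + \alpha y_r \eta_r - \tfrac12 \alpha \eta_r \eta_r)\, dc \prod dy.$

Putting $y_r = c^{1/2} x_r$ we obtain

(3) $\cdots (1 - x_r x_r)^{\frac12(n-2)} \exp(-\tfrac12 \alpha c + \alpha c^{1/2} x_r \eta_r - \tfrac12 \alpha \eta_r \eta_r)\, dc \prod dx.$

The hypothesis $H$ is now

$H'$: $\eta_r = 0$ $(r = 1, \ldots, n_1)$.

If $w$ is any critical region for the rejection of $H'$, denote by $w(c)$ the cross
section of $w$ for every fixed $c$. Then the power function of $w$ is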

Mvy ot) = fiwiflii • • •, 1?1I1, a) 

e-*""'"' def (1 - dx. 

Jo Ju>io) 

It is known [3] that, in order to have

(5) $\beta_w(0, \alpha) = \epsilon$

for all $\alpha$, it is necessary and sufficient that

(6) $\int_{w(c)} (1 - x_r x_r)^{\frac12(n-2)} \prod dx = A\epsilon,$

where $A$ is a constant.

The $E^2$-test is the test based on the critical region

$w_0$: $x_r x_r = c^{-1} y_r y_r = E^2 \ge$ const.

The author has proved [3] that of all the critical regions which satisfy (5) and
whose power function is a function of $\alpha \eta_r \eta_r$ alone, the region $w_0$ is the uniformly
most powerful one. This result is generalized by Wald [7], who proved that, of
all the regions satisfying (5), the surface integral

$\gamma_w(\alpha, \lambda) = \int_{\eta_r \eta_r = \lambda} \beta_w(\eta, \alpha)\, dA$

is maximum when $w$ is $w_0$. The author gives here another proof of Wald's
theorem which is easier as it dispenses with the somewhat intricate Lemma 1
of Wald. From (4) we have

7»(a, X) - /!:«*'"•+*> dc 

•I (1 — dx I exp (-^ioirjrfjr + OtC^XrVr) dA, 

Jw(a) J^rnr^ 




By means of a rotation in the space of $(\eta_1, \ldots, \eta_{n_1})$ we can obtain

J r exp (—Joijrljr + aC^XrHlr) dA 

= I exp (-iafrfr + ei(C*(aV®,)*fl) dA - ]C 0*«“(c»riEr)*, 
•'f,rr-X t=0 


where $a_k$ depends only on $\alpha$, $k$ and $\lambda$. Hence

(7) y„(a, X) = £ 6* dc f (ir*r)*(l - avir)‘"“*n dx, 

it-0 •'O •'w(c) 

where $b_k$ depends only on $k$, $\alpha$ and $\lambda$. Since $w(c)$ satisfies (6), it follows from a
lemma of Neyman and Pearson [5] that

$\int_{w(c)} (x_r x_r)^k (1 - x_r x_r)^{\frac12(n-2)} \prod dx$

is maximum, for all $c$ and $k$, when $w(c)$ is the region $x_r x_r \ge$ const., i.e. when $w$
is itself the region $x_r x_r \ge$ const. This proves Wald's theorem.

Still another optimum property of the $E^2$-test may be established on using
the volume integral instead of the surface integral. This is stated in the
following theorem.

Theorem 1. Let $S$ be any linear set and let

$\phi_w(\alpha, S) = \int_{\eta_r \eta_r \in S} \beta_w(\eta, \alpha) \prod d\eta.$

Of all the regions satisfying (5), the region $w_0$ has the maximum $\phi_w(\alpha, S)$.

For, by the same computation which leads to (7), we easily obtain 

«>«(«, -S) = Z c* r dc f (xrXrfil - x,Xr)‘”"‘n dx, 

kmmO Jo JwCe) 

where $c_k$ depends only on $k$, $\alpha$ and $S$. Hence the result follows.

This theorem also contains my previous result as a consequence. For, writing

$\beta_w(\eta, \alpha) = f(\alpha \eta_r \eta_r)$, $\beta_{w_0}(\eta, \alpha) = f_0(\alpha \eta_r \eta_r)$,

we have

0 < / ^ (/o(ocijr1?r) - f(otflrrjr))U dfj = ““ /(«0) dt 

Since $S$ is arbitrary, we must have $f(\alpha t) \le f_0(\alpha t)$.

The case $n_1 = 1$. When $n_1 = 1$, (2) and $H$ become respectively

$\exp(-\tfrac12 \alpha_{ij} c_{ij} + \alpha_{ij} y_i \eta_j - \tfrac12 \alpha_{ij} \eta_i \eta_j) \prod dy\, dc,$
$H''$: $\eta_i = 0$ $(i = 1, \ldots, p)$.
There is a unique real matrix

$T = \begin{bmatrix} t_{11} & & \\ \vdots & \ddots & \\ t_{p1} & \cdots & t_{pp} \end{bmatrix}$ ($t_{ii} > 0$; zeros above the principal diagonal)

such that $[c_{ij}] = TT'$ [2]. Introducing the new variables $x_1, \ldots, x_p$ by means
of the transformation

(9) $[y_1, \ldots, y_p] = [x_1, \ldots, x_p]\,T'$

with the Jacobian $|T| = |c_{ij}|^{1/2}$ we obtain the distribution
fix, c)ndxdc = Kl an I Cii 

•exp i—iaifin + ai^k^kVi — iat^ifiDlldx dc 
(A; = 1, • • • , p; « 0 when k > i). 

If $w$ is any region, we write

$\beta_w(\eta, \alpha) = \beta_w(\eta_1, \ldots, \eta_p, \alpha_{11}, \alpha_{12}, \ldots, \alpha_{pp}) = \int_w f(x, c) \prod dx\, dc,$

so that $\beta_w(\eta, \alpha)$ is the power function if $w$ serves as a critical region for rejecting
$H''$. We have, symbolically,

$w = D \times w(c),$

where $D$ is the set of points $(c_{ij})$ for which $[c_{ij}]$ is positive definite and $w(c)$ is
the cross section of $w$ for fixed $c_{ij}$. Then

ffUv, «) = XI «<y f I Cti 1“"-'” H dc 

■ f (1 - 

Jw(9) 

It is known [6] that, in order to have

(11) $\beta_w(0, \alpha) = \epsilon$

for all $\alpha_{ij}$, it is necessary and sufficient that

(12) $\int_{w(c)} (1 - x_i x_i)^{\frac12(n-p-1)} \prod dx = B\epsilon,$

where

$B = \int_{x_i x_i \le 1} (1 - x_i x_i)^{\frac12(n-p-1)} \prod dx.$

The $T^2$-test is the test based on the critical region

$w_0$: $x_i x_i = c^{ij} y_i y_j = T^2/(1 + T^2) \ge$ const., or $T^2 \ge$ const.,
where $c^{ij}$ is the general element of $[c_{ij}]^{-1}$ and $T^2$ is, except for a constant factor,
Hotelling's generalization of "Student's" ratio.

In order to establish an optimum property of $T^2$ analogous to that of $E^2$ given
in Theorem 1, we define, for any linear set $S$ and any region $R$ in the sample
space,


MS) = / 


<>()II dfi da. 


$\psi_R(S)$ does not necessarily have a finite value, and it is this fact which renders
the following theorem less satisfactory than Theorem 1.

Theorem 2. Let $\rho_p$ be the smallest latent root of $[c_{ij}]$ and let $E$ be any subset
of $D$ in which $\rho_p$ is at least equal to a fixed positive constant. Of all the critical
regions $w$ which satisfy (11), the region $w_0$ has the maximum $\psi_{wE}(S)$.
In order to prove this theorem we need the following two lemmas.

Lemma 1. If $\epsilon$ is a positive constant, the integral


has a finite value. 

Proof. Let $\rho_1, \ldots, \rho_p$ be the latent roots of $[c_{ij}]$ in the descending order
of magnitude. From a known theorem [1] we get






Hence I is finite. 
Lemma 2. 


(p. • • • Pp)"*'" n (Pi - py)n dp 


f ' C (6 • 


(13) i^MS) ^'Lgk f I c,y dc f (1 - xai)*"‘~’’~'-\x,x,)^n dx 

fc-O •'iP Jw(e) 

and $\psi_{wE}(S)$ is finite, where $g_k$ depends only on $k$ and $S$.

Proof. Let $A$ be the set of points $(\alpha_{ij})$ for which $[\alpha_{ij}]$ is positive definite.
By (8), we have

MiS) f 1 c,y - y,y, U dy dc f 1 a.yda, 

JtoB •A 


exp (’-iaijiiiVj + otijyiTij)!! dr\. 


There is a real non-singular matrix $G = [g_{ij}]$ such that $[\alpha_{ij}] = GG'$. Using the
transformation

$[\eta_1, \ldots, \eta_p]\,G = [\zeta_1, \ldots, \zeta_p],$

whose Jacobian is $|G|^{-1} = |\alpha_{ij}|^{-1/2}$, we have

J |a<yr^ f exp df* 


(14) 


This is reducible by means of a rotation to 

J==lo«,l"*f exp(-iT,T< + (a</y<tfAj)ndT 

ce 

“ 1 «*7 r' ^ dkioLijViVif t 

fc—0 

where 

-ciiL/"w 

and dit depends only on k and S. Hence 

f 1 an J n da = 2 d» /* , 

Ja *“*0 




where 

(15) 

Now 

\vhei*e 




/. - |/(i) 


(-iO 


= f = Kilc.7-2<y,yj 

= -2<c«y.»iV 


~K»*+p+i) 

.0* 


Hence 

(16) 

where 


h = ^ib i 






ek = 
Hence

= If g dtet 1 I Cy - y,y, l*'*-'-“(c«».yy)*n dy dc 

^'ll 9k f \c<s n <fc f (1 - *,*,)“"-'-"(***,)* n dx, 

fc-o Je J»(c) 

where Qk »= Ktdtfik depends only on k and S. 

Now 

f (1 — X{Xi)^^'''^’^^\ziZi)^ndx < f ndx, 

•'v(c) JasiXi<l 

f iCor^’’**’ndc < f 1 C<y 1“*'*^’n dc 

Je 

is finite by Lemma 1. Hence 

\ 

to oo r* 

^ve(S) < const, ^dkek == const. 2 ~ 

ib-4) Je->0 

and so $\psi_{wE}(S)$ is finite. This proves Lemma 2.

Proof of Theorem 2. Since $\psi_{wE}(S)$ is expressible as (13) and is always finite,
it follows from (12) and the Neyman-Pearson Lemma that $\psi_{wE}(S)$ is maximum
when $w$ is $w_0$. This proves Theorem 2.

Simaika [6] proved that of all the critical regions $w$ which satisfy the conditions

(a) $\beta_w(0, \alpha) = \epsilon$ for all $\alpha_{ij}$,

(b) $\beta_w(\eta, \alpha) = f(\alpha_{ij} \eta_i \eta_j)$,

$w_0$ is the uniformly most powerful one. Strangely enough, this result cannot
be deduced as a consequence from our Theorem 2.

The difficulty in dealing with the integral $\psi_w(S)$ is that it is not always finite.
In order to have a finite integral let us consider the following:

T^(e,S)=‘l Cl) n dv da, 

•'a »’»/*« 

where $[\theta_{ij}]$ is a positive definite matrix. As an immediate consequence of
Simaika's theorem we have

(17) $\Gamma_w(\theta, S) \le \Gamma_{w_0}(\theta, S)$

for any region $w$ satisfying (a) and (b). Now the question arises whether
(17) remains true if the condition (b) on $w$ is removed. The following theorem
answers this question in the negative.

Theorem 3. Let $[\theta_{ij}]$ be a positive definite matrix, $[\rho_{ij}] = [c_{ij} + \theta_{ij}]^{-1}$ and
$\lambda_1, \ldots, \lambda_p$ be the roots of the equation $|c_{ij} - \lambda\theta_{ij}| = 0$. There is a function
$g = g(\lambda_1, \ldots, \lambda_p)$ such that the region

$w_1$: $\rho_{ij} y_i y_j \ge g(\lambda_1, \ldots, \lambda_p)$

satisfies (a) and has the maximum $\Gamma_w(\theta, S)$.


(<.!)■ 
Proof. From (10) and (14) we obtain
r„(9, S) = f I Ci^ - Pit//n dp dc 

» A 

Comparing the inner integral with (15) and using (16) we get 

T„{e,S)=f:9* /+ 

*—0 ^vi 

(18) = E (7* / I Cy + Oii r‘'’‘+'’+" I c<y n dc 

fc-0 Jd 

■f (1 - x.aj<)“"-'^”(7«®4a:y)*n di, 

where yijXiXj is the result of applying the transformation (9) on pi^y^s. We 
shall show that, for every fixed set of , a unique number g = gf(Xi, • • • , X,) 
exists such that the region p,i2/,-2/i = yijXiXj > g satisfies (12), i.e. 

(19) f (1 - x.x<)“"~'““ndx = Be. 

Since [y*,] = T'[cij + BiiT^T, the latent roots of [y,y] are X,/(l + X,) (i = 1, 
• • • , p). Hence by a rotation the equation (19) is reduced to 

(20) f (1 - n = Be. 

As g increases from 0 onwards, the left member of (20) decreases steadily from 
B to 0. Hence there is a unique g = gi\i , • • • , Xp) which satisfies (20). 

For this fir(Xi, • • • , Xp) the region Wi satisfies (a). Hence, applying the 
Neyman-Pearson Lemma on (18) we obtain the result. 

From Theorem 3 we learn that there actually exist other exact tests for $H''$
which have some optimum property not possessed by $T^2$, viz., the tests based
on the critical regions $w_1$ corresponding to various values of the $\theta_{ij}$. However,
the great difficulty in numerical computation prohibits their application and the
$T^2$-test stands out as the only test which is both simple and good.
SOME GENERALIZATIONS OF THE THEORY OF CUMULATIVE SUMS
OF RANDOM VARIABLES


By Abraham Wald
Columbia University

1. Introduction. In a previous paper [1] the author dealt with the following
problem: Let $\{z_i\}$ ($i = 1, 2, \ldots,$ ad inf.) be a sequence of independently
distributed random variables each having the same distribution. Let $a$ be a given
positive constant, $b$ a given negative constant and denote by $n$ the smallest
positive integer for which either

(1) $z_1 + \cdots + z_n \ge a$

or

(2) $z_1 + \cdots + z_n \le b$

holds. The main problems treated in [1] were: (1) Derivation of the probability
that the cumulative sum reaches the boundary $a$ before the boundary $b$ is reached;
(2) Derivation of the characteristic function and the distribution function of $n$.
In this paper we shall consider the following more general problem: Let
$K = \{k_i(z_1, \ldots, z_i)\}$ ($i = 1, 2, \ldots,$ ad inf.) be a given sequence of functions and let
$n$ be the smallest positive integer for which either

(3) $k_n(z_1, \ldots, z_n) \ge 1$

or

(4) $k_n(z_1, \ldots, z_n) \le -1$

holds. No restrictions are imposed on the sequence $K$ except that it must be
such that the probability that $n < \infty$ is equal to one. The purpose of this
paper is to derive some theorems concerning the probability that $k_n(z_1, \ldots, z_n) \ge 1$
and concerning the expected value of $n$. Obviously, the problem formulated
here is a generalization of that considered in [1], since the latter can be obtained
by putting

$k_i(z_1, \ldots, z_i) = \dfrac{2}{a - b}(z_1 + \cdots + z_i) - \dfrac{a + b}{a - b}.$

2. The conjugate distribution of $z$. Let $z$ be a random variable whose
distribution is equal to the common distribution of $z_i$. In this section we shall
introduce the notion of the conjugate distribution of $z$ which will be used later.
According to Lemma 2 in [1], under some weak restrictions on the distribution
of $z$ there exists exactly one real value $h_0 \ne 0$ such that

(5) $E e^{h_0 z} = 1$

where $E(u)$ denotes the expected value of $u$ for any random variable $u$.
For simplicity we shall assume that $z$ has a continuous distribution admitting
a probability density everywhere, or that $z$ has a discrete distribution. By the
probability distribution $f(z)$ of $z$ we shall mean the probability density of $z$ if
the distribution of $z$ is continuous. In the discrete case $f(z)$ will denote the
probability that the random variable takes the value $z$. From (5) it follows that

(6) $f^*(z) = e^{h_0 z} f(z)$

is a probability distribution. We shall call $f^*(z)$ the conjugate distribution of $z$.
For any random variable $u$ we shall denote by $E^*(u)$ the expected value of $u$
under the assumption that the distribution of $z$ is given by $f^*(z)$. The expected
values $E(u)$ and $E^*(u)$ may depend on the sequence $K = \{k_i(z_1, \ldots, z_i)\}$
($i = 1, 2, \ldots,$ ad inf.). Occasionally we shall put this dependence in evidence
by writing $E(u \mid K)$ and $E^*(u \mid K)$, respectively.
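
For a discrete distribution the constant $h_0$ of (5) and the conjugate distribution (6) can be found numerically. The Python sketch below uses plain bisection on $E e^{hz} - 1$; it assumes that $z$ takes both positive and negative values and that $Ez \ne 0$, and the function names are hypothetical.

```python
import math


def solve_h0(values, probs, iters=200):
    """Non-zero root h0 of E exp(h z) = 1 for a discrete distribution."""
    mean = sum(v * p for v, p in zip(values, probs))
    if mean == 0 or not (min(values) < 0 < max(values)):
        raise ValueError("need E z != 0 and values of both signs")

    def g(h):
        return sum(p * math.exp(h * v) for v, p in zip(values, probs)) - 1.0

    # g is convex with g(0) = 0, so the non-zero root has the sign of -E z.
    sign = 1.0 if mean < 0 else -1.0
    hi = 1.0
    while g(sign * hi) <= 0.0:      # move out until g is positive
        hi *= 2.0
    lo = hi
    while g(sign * lo) >= 0.0:      # move in until g is negative
        lo /= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(sign * mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return sign * 0.5 * (lo + hi)


def conjugate(values, probs):
    """Conjugate probabilities f*(z) = exp(h0 z) f(z)."""
    h0 = solve_h0(values, probs)
    return [p * math.exp(h0 * v) for v, p in zip(values, probs)]
```

For $z = \pm 1$ with probabilities $p$ and $q = 1 - p$, equation (5) gives $e^{h_0} = q/p$, so the conjugate distribution simply interchanges $p$ and $q$; this provides a closed-form check of the numerical root.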

3. Two theorems. In this section we shall derive two theorems. The first
theorem is concerned with the probability that $k_n(z_1, \ldots, z_n) \ge 1$ and the
second theorem with the expected value of $n$. In what follows the operator $E_1$
will mean conditional expected value under the restriction that $k_n(z_1, \ldots, z_n) \ge 1$
and $E_2$ will mean conditional expected value under the restriction that
$k_n(z_1, \ldots, z_n) \le -1$. If the distribution of $z$ is given by $f^*(z)$, these conditional
expected values will be denoted by the operators $E_1^*$ and $E_2^*$, respectively.

Theorem 1. Let $K = \{k_i(z_1, \ldots, z_i)\}$ be a sequence such that the probability
that $n < \infty$ is equal to one under both distributions $f(z)$ and $f^*(z)$. Let $\gamma$ denote
the probability that $k_n(z_1, \ldots, z_n) \ge 1$ when $f(z)$ is the distribution of $z$, and let
$\gamma^*$ denote the probability of the same event when $f^*(z)$ is the distribution of $z$. Then

(7) $E_1(e^{h_0 Z_n} \mid K) = \dfrac{\gamma^*}{\gamma}, \qquad E_2(e^{h_0 Z_n} \mid K) = \dfrac{1 - \gamma^*}{1 - \gamma}$

and

(8) $E_1^*(e^{-h_0 Z_n} \mid K) = \dfrac{\gamma}{\gamma^*}, \qquad E_2^*(e^{-h_0 Z_n} \mid K) = \dfrac{1 - \gamma}{1 - \gamma^*}$

where $Z_n = z_1 + \cdots + z_n$.

Proof: From (6) it follows that

(9) $e^{h_0 Z_n} = \dfrac{f^*(z_1) \cdots f^*(z_n)}{f(z_1) \cdots f(z_n)}$

and

(10) $e^{-h_0 Z_n} = \dfrac{f(z_1) \cdots f(z_n)}{f^*(z_1) \cdots f^*(z_n)}.$

A set $(z_1, \ldots, z_n)$ will be said to be of type 1 if and only if
$-1 < k_m(z_1, \ldots, z_m) < 1$ for $m = 1, \ldots, n - 1$ and $k_n(z_1, \ldots, z_n) \ge 1$. Similarly a set
$(z_1, \ldots, z_n)$ will be said to be of type 2 if and only if $-1 < k_m(z_1, \ldots, z_m) < 1$ for
$m = 1, \ldots, n - 1$ and $k_n(z_1, \ldots, z_n) \le -1$.
We shall prove Theorem 1 under the assumption that the distribution of $z$ is
discrete. Because of (9) we have

(11) $E_1(e^{h_0 Z_n} \mid K) = E_1\!\left(\dfrac{f^*(z_1) \cdots f^*(z_n)}{f(z_1) \cdots f(z_n)}\right) = \dfrac{\sum f^*(z_1) \cdots f^*(z_n)}{\sum f(z_1) \cdots f(z_n)}$

where the summation is to be taken over all sets $(z_1, \ldots, z_n)$ of type 1. But
the last expression is obviously equal to $\gamma^*/\gamma$ and, therefore, the first equation in
(7) is proved. The second equation in (7) follows in the same manner if we take
into account the fact that the probability that $n < \infty$ is equal to one. Similarly,
equation (8) can be obtained from (10). The proof can easily be extended to
the case when the distribution of $z$ is continuous. Hence, Theorem 1 is proved.

Theorem 2. If $Ez \ne 0$, the relation

(12) $E(n \mid K) = \dfrac{E(Z_n \mid K)}{Ez}$

holds for any sequence $K = \{k_i(z_1, \ldots, z_i)\}$ for which one of the following two
conditions is fulfilled:

(a) There exists an integer $N$ such that the probability that $n \le N$ is equal to one.

(b) $E(n \mid K) < \infty$ and the first four moments of $z$ are finite.

Proof: First we shall show that condition (a) implies the validity of (12).
For any integer $i$ we shall denote $z_1 + \cdots + z_i$ by $Z_i$. Since the probability
that $n \le N$ is equal to 1, we have

(13) $E(Z_n \mid K) + E(z_{n+1} + \cdots + z_N) = E Z_N = N\,Ez.$

Since the conditional expected value of $(z_{n+1} + \cdots + z_N)$ for a given value of
$n$ is equal to $(N - n)Ez$, we have

(14) $E(z_{n+1} + \cdots + z_N) = E(N - n \mid K)\,Ez = N\,Ez - E(n \mid K)\,Ez.$

Equation (12) follows from (13) and (14).

Now we shall show that condition (b) implies (12). Denote by $P_N$ the
probability that $n \le N$. Let the operator $E_N$ denote conditional expected value
under the restriction that $n \le N$, and let the operator $E'_N$ denote conditional
expected value under the restriction that $n > N$. Then we have

(15) $P_N E_N(Z_N) + (1 - P_N) E'_N(Z_N) = E(Z_N) = N\,Ez.$

Since

(16) $E_N(Z_N) = E_N(Z_n \mid K) + E_N(z_{n+1} + \cdots + z_N \mid K)$
$= E_N(Z_n \mid K) + E_N(N - n \mid K)\,Ez$
$= E_N(Z_n \mid K) + N\,Ez - E_N(n \mid K)\,Ez,$

we obtain from (16)

(17) $P_N\{E_N(Z_n \mid K) + N\,Ez - E_N(n \mid K)\,Ez\} + (1 - P_N) E'_N(Z_N) = N\,Ez.$

From $E(n \mid K) < \infty$ it follows that

(18) $\lim_{N \to \infty} (1 - P_N) N = 0.$

Now we shall show that (18) implies the validity of

(19) $\lim_{N \to \infty} (1 - P_N) E'_N(Z_N) = 0.$

Let $T_N = Z_N - N\,Ez$. Because of (18), (19) is proved if we can show that

(20) $\lim_{N \to \infty} (1 - P_N) E'_N(T_N) = 0.$

Denote by $R_N$ the set of all points $(z_1, \ldots, z_N)$ for which $n > N$. Then the
probability measure of $R_N$ is equal to $1 - P_N$ and

(21) $(1 - P_N) E'_N(T_N) = \int_{R_N} T_N f(z_1) \cdots f(z_N)\, dz_1 \cdots dz_N.$

Let $R'_N$ be the part of $R_N$ in which $T_N < -N$, $R''_N$ the part of $R_N$ in which $T_N > N$
and $R'''_N$ the part of $R_N$ in which $-N \le T_N \le N$. Because of (18) we have

(22) $\lim_{N \to \infty} \left| \int_{R'''_N} T_N f(z_1) \cdots f(z_N)\, dz_1 \cdots dz_N \right| \le \lim_{N \to \infty} (1 - P_N) N = 0.$

Denote the cumulative distribution function of $T_N$ by $F_N(T_N)$. Clearly,

(23) $\int_{R''_N} T_N f(z_1) \cdots f(z_N)\, dz_1 \cdots dz_N \le \int_N^\infty T_N\, dF_N(T_N) \le \dfrac{1}{N^3} \int_N^\infty T_N^4\, dF_N(T_N).$

Since the first four moments of $z$ are finite, the 4-th moment of $T_N/\sqrt{N}$ converges
to $3\sigma^4$ where $\sigma$ is the standard deviation of $z$. Hence

(24) $\lim_{N \to \infty} \int_{-\infty}^{\infty} \dfrac{T_N^4}{N^2}\, dF_N(T_N) = 3\sigma^4.$

From (23) and (24) it follows that

(25) $\lim_{N \to \infty} \int_{R''_N} T_N f(z_1) \cdots f(z_N)\, dz_1 \cdots dz_N = 0.$

Similarly we can prove that

(26) $\lim_{N \to \infty} \int_{R'_N} T_N f(z_1) \cdots f(z_N)\, dz_1 \cdots dz_N = 0.$

Equation (20) follows from (21), (22), (25) and (26). Hence (19) is proved.
From (17), (18) and (19) we obtain

(27) $\lim_{N \to \infty} P_N\{E_N(Z_n \mid K) - E_N(n \mid K)\,Ez\} = 0.$

Since $Ez \ne 0$, $\lim P_N = 1$, $\lim E_N(n \mid K) = E(n \mid K)$ and $\lim E_N(Z_n \mid K) = E(Z_n \mid K)$,
equation (12) follows from (27). Hence condition (b) implies (12)
and Theorem 2 is proved.
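
Theorems 1 and 2 can be checked numerically on the simplest special case, the cumulative sum of steps $z = \pm 1$ stopped at integer boundaries $a > 0 > b$, where $Z_n$ hits a boundary exactly. The Python sketch below propagates the distribution of the sum step by step; it is an illustration under these assumptions, with hypothetical function names, and not a procedure taken from the paper.

```python
import math


def absorb(p, a, b, horizon=5000):
    """Walk with steps +1 (prob p) and -1 (prob 1-p), stopped when the sum
    reaches a or b.  Returns (probability of stopping at a, expected n)."""
    q = 1.0 - p
    mass = {0: 1.0}                 # distribution of the sum among unstopped paths
    gamma = 0.0
    exp_n = 0.0
    for n in range(1, horizon + 1):
        new = {}
        for pos, w in mass.items():
            for step, pr in ((1, p), (-1, q)):
                nxt = pos + step
                if nxt >= a:
                    gamma += w * pr
                    exp_n += n * w * pr
                elif nxt <= b:
                    exp_n += n * w * pr
                else:
                    new[nxt] = new.get(nxt, 0.0) + w * pr
        mass = new
    return gamma, exp_n


def check(p=0.4, a=3, b=-2):
    """Residuals of the two equations in (7) and of (12) for the +-1 walk."""
    q = 1.0 - p
    h0 = math.log(q / p)                     # root of p e^h + q e^-h = 1
    gamma, exp_n = absorb(p, a, b)
    gamma_star, _ = absorb(q, a, b)          # conjugate walk has up-probability q
    r1 = math.exp(h0 * a) - gamma_star / gamma
    r2 = math.exp(h0 * b) - (1 - gamma_star) / (1 - gamma)
    r3 = exp_n - (gamma * a + (1 - gamma) * b) / (p - q)
    return r1, r2, r3
```

All three residuals returned by `check()` should vanish up to rounding, since here $E_1 e^{h_0 Z_n} = e^{h_0 a}$, $E_2 e^{h_0 Z_n} = e^{h_0 b}$ and $E(Z_n \mid K) = \gamma a + (1 - \gamma) b$.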

4. Lower limit of $E(n \mid K)$. In this section we shall derive a lower limit for
$E(n \mid K)$. First we shall prove the following lemma.

Lemma 1. For any random variable $u$ we have

(28) $e^{Eu} \le E e^{u}.$

Proof: Inequality (28) can be written as

(29) $1 \le E e^{u'}$

where $u' = u - Eu$. Lemma 1 is proved if we show that (29) holds for any
random variable $u'$ whose mean is zero. Expanding $e^{u'}$ in a Taylor series
around $u' = 0$, we obtain

$e^{u'} = 1 + u' + \dfrac{u'^2}{2} e^{\xi(u')}$ where $0 \le \xi(u') \le u'$.

Hence

$E e^{u'} = 1 + E\!\left(\dfrac{u'^2}{2} e^{\xi(u')}\right) \ge 1$

and Lemma 1 is proved.

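Now we are able to prove the following theorem.

Theorem 3. Let $K = \{k_i(z_1, \ldots, z_i)\}$ be a sequence of functions such that
the probability that $n < \infty$ is one under both distributions $f(z)$ and $f^*(z)$ of $z$. Let
$\gamma$ be the probability that $k_n(z_1, \ldots, z_n) \ge 1$ when $f(z)$ is the distribution of $z$,
and let $\gamma^*$ be the probability of the same event when $f^*(z)$ is the distribution of $z$.
Then

(30) $E(n \mid K) \ge \dfrac{1}{h_0 Ez}\left[\gamma \log \dfrac{\gamma^*}{\gamma} + (1 - \gamma) \log \dfrac{1 - \gamma^*}{1 - \gamma}\right]$

and

(31) $E^*(n \mid K) \ge \dfrac{1}{h_0 E^*z}\left[\gamma^* \log \dfrac{\gamma^*}{\gamma} + (1 - \gamma^*) \log \dfrac{1 - \gamma^*}{1 - \gamma}\right]$

provided that $Ez$ and $E^*z$ are not equal to zero.

Proof: First we shall prove Theorem 3 in the case when there exists an integer
$N$ such that the probability that $n \le N$ is one. According to Theorem 2 we have

(32) $E(n \mid K) = \dfrac{1}{Ez}\left[\gamma E_1(Z_n \mid K) + (1 - \gamma) E_2(Z_n \mid K)\right].$

From Lemma 1 and Theorem 1 it follows that

(33) $h_0 E_1(Z_n \mid K) \le \log \dfrac{\gamma^*}{\gamma}$ and $h_0 E_2(Z_n \mid K) \le \log \dfrac{1 - \gamma^*}{1 - \gamma}.$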
From (32) and (33) we obtain

(34) $h_0 Ez\,E(n \mid K) = h_0\left[\gamma E_1(Z_n \mid K) + (1 - \gamma) E_2(Z_n \mid K)\right] \le \gamma \log \dfrac{\gamma^*}{\gamma} + (1 - \gamma) \log \dfrac{1 - \gamma^*}{1 - \gamma}.$

Inequality (30) follows from (34) if we can show that $h_0 E(z) < 0$. From $E e^{h_0 z} = 1$
and Lemma 1 it follows that $h_0 E(z) \le 0$. Since $h_0 \ne 0$ and $E(z) \ne 0$, we must
have $h_0 E(z) < 0$. Hence (30) is proved. To prove (31) we proceed as follows:
From Theorem 2 we obtain

(35) $-h_0 E^*z\,E^*(n \mid K) = -h_0\left[\gamma^* E_1^*(Z_n \mid K) + (1 - \gamma^*) E_2^*(Z_n \mid K)\right].$

From Lemma 1 and Theorem 1 it follows that

(36) $-h_0\left[\gamma^* E_1^*(Z_n \mid K) + (1 - \gamma^*) E_2^*(Z_n \mid K)\right] \le \gamma^* \log \dfrac{\gamma}{\gamma^*} + (1 - \gamma^*) \log \dfrac{1 - \gamma}{1 - \gamma^*}.$

From (35) and (36) we obtain

(37) $h_0 E^*(z)\,E^*(n \mid K) \ge \gamma^* \log \dfrac{\gamma^*}{\gamma} + (1 - \gamma^*) \log \dfrac{1 - \gamma^*}{1 - \gamma}.$

Since $E^* e^{-h_0 z} = 1$ it follows from Lemma 1 that $-h_0 E^*z < 0$. Inequality (31)
follows from this and (37). Hence Theorem 3 is proved in the special case when
there exists an integer $N$ such that the probability that $n \le N$ is equal to one.

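To prove Theorem 3 in the general case, for any integer $N$ let the sequence
$K_N = \{k_{iN}(z_1, \ldots, z_i)\}$ be defined as follows: $k_{iN}(z_1, \ldots, z_i) = k_i(z_1, \ldots, z_i)$
for $i < N$ and $k_{iN}(z_1, \ldots, z_i) = 1$ for $i \ge N$. Denote by $\gamma_N$ and $\gamma_N^*$ the values
of $\gamma$ and $\gamma^*$, respectively, if the sequence $K$ is replaced by $K_N$. Then we have

(38) $E(n \mid K) \ge E(n \mid K_N) \ge \dfrac{1}{h_0 Ez}\left[\gamma_N \log \dfrac{\gamma_N^*}{\gamma_N} + (1 - \gamma_N) \log \dfrac{1 - \gamma_N^*}{1 - \gamma_N}\right]$

and

(39) $E^*(n \mid K) \ge E^*(n \mid K_N) \ge \dfrac{1}{h_0 E^*z}\left[\gamma_N^* \log \dfrac{\gamma_N^*}{\gamma_N} + (1 - \gamma_N^*) \log \dfrac{1 - \gamma_N^*}{1 - \gamma_N}\right].$

Since $\lim_{N \to \infty} \gamma_N = \gamma$ and $\lim_{N \to \infty} \gamma_N^* = \gamma^*$, inequalities (30) and (31) follow from (38)
and (39). Hence the proof of Theorem 3 is completed.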
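
The lower limits (30) and (31) can be illustrated on a walk that overshoots its upper boundary, so that the inequalities are strict. The Python sketch below takes $z = +2$ or $-1$ with assumed probabilities 0.3 and 0.7 and boundaries $a = 3$, $b = -3$; these numbers, and all function names, are chosen only for illustration.

```python
import math


def solve_h0(steps, iters=200):
    """Non-zero root of E exp(h z) = 1; steps is a list of (value, prob)."""
    mean = sum(v * p for v, p in steps)

    def g(h):
        return sum(p * math.exp(h * v) for v, p in steps) - 1.0

    sign = 1.0 if mean < 0 else -1.0
    hi = 1.0
    while g(sign * hi) <= 0.0:
        hi *= 2.0
    lo = hi
    while g(sign * lo) >= 0.0:
        lo /= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(sign * mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return sign * 0.5 * (lo + hi)


def absorb(steps, a, b, horizon=5000):
    """Returns (P(sum >= a at stopping), E n) for the stopped cumulative sum."""
    mass = {0: 1.0}
    gamma = 0.0
    exp_n = 0.0
    for n in range(1, horizon + 1):
        new = {}
        for pos, w in mass.items():
            for v, pr in steps:
                nxt = pos + v
                if nxt >= a:
                    gamma += w * pr
                    exp_n += n * w * pr
                elif nxt <= b:
                    exp_n += n * w * pr
                else:
                    new[nxt] = new.get(nxt, 0.0) + w * pr
        mass = new
    return gamma, exp_n


def lower_limits(steps, a, b):
    """Returns (E n, right side of (30), E* n, right side of (31))."""
    h0 = solve_h0(steps)
    conj = [(v, p * math.exp(h0 * v)) for v, p in steps]   # f*(z) of (6)
    g, en = absorb(steps, a, b)
    gs, ens = absorb(conj, a, b)
    ez = sum(v * p for v, p in steps)
    ezs = sum(v * p for v, p in conj)
    log1 = math.log(gs / g)
    log2 = math.log((1 - gs) / (1 - g))
    bound30 = (g * log1 + (1 - g) * log2) / (h0 * ez)
    bound31 = (gs * log1 + (1 - gs) * log2) / (h0 * ezs)
    return en, bound30, ens, bound31
```

Calling `lower_limits([(2, 0.3), (-1, 0.7)], 3, -3)` returns the two expected sample sizes next to their lower limits.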

5. Remarks added in proof. The results obtained in the present paper have
obvious applications to sequential analysis. These applications are, however,
not mentioned here, because at the time the present paper was submitted for
publication, sequential analysis constituted classified material. In the
meantime, the material on sequential analysis has been released and was published in
this Journal, June, 1945. The results obtained in the present paper are more
general than those obtained in connection with sequential analysis. Theorem 3,
in the present paper, implies the efficiency of the sequential probability ratio
test discussed in Section 4.7 of the paper on sequential tests.
ON THE DESIGN OF EXPERIMENTS FOR WEIGHING AND MAKING
OTHER TYPES OF MEASUREMENTS

By K. Kishen

Department of Agriculture, Lucknow, India

1. Introduction. In a recent paper, Hotelling [1] has discussed the basic
principles of the theory of the design of efficient experiments for estimating the
true unknown weights of $p$ given objects by means of a specified number $N$ of
weighings, $p \le N$ in case the scale is free from bias and $p \le N - 1$ if it has a
bias the unknown value of which has to be estimated from the same data. He
has emphasized the importance of these designs in other kinds of measurements
besides weighing of objects and has called attention to the need for further
mathematical research for obtaining a "comprehensive general solution." Such
a solution has now been obtained in case the number of weighings $N$ is at our
choice. Some other general designs have also been given in this paper for
specified values of $N$ and $p$.

2. Estimation of unknown weights and efficiency of a design. Using
Hotelling's notation, we may write

(1) $E(y_\alpha) = \sum_{i=1}^{p} x_{i\alpha} b_i,$

where $i = 1, 2, \ldots, p$, on the assumption that there is either zero bias in the
scale or the bias is known a priori, and $\alpha = 1, 2, \ldots, N$. $E(y_\alpha)$ is the
expectation of the $\alpha$th weighing. For a biassed scale, we may take $i = 0, 1, 2, \ldots, p$.
The efficient estimate of each of the $b_i$'s has been derived by Hotelling by the
method of least squares. It is of interest to obtain these estimates by the use
of the theory of linear estimation as developed by Bose [2] and Rao [3].

Assuming that $y_1, y_2, \ldots, y_N$ are $N$ stochastic variates forming a
multivariate normal system with the variance and covariance matrix given by

(2) $U = [u_{ij}],$

it follows from Rao's generalization of Markoff's theorem that the best unbiassed
estimates of the $b_i$'s are given by the solutions of the normal equations

(3) $X'U^{-1}XB' = X'U^{-1}Y',$

where $B = [b_1\ b_2 \cdots b_p]$ and $Y = [y_1\ y_2 \cdots y_N]$, and $B'$ and $Y'$ denote as usual
the transpose of the row vectors $B$ and $Y$, i.e. column vectors.

In the present case, the assumption is that all the $N$ stochastic variates are
uncorrelated and have a common variance $\sigma^2$, so that

(4) $U = \sigma^2 I.$
Hence the normal equations in (3) reduce to

(5) $X'XB' = X'Y',$

which are exactly the same as the normal equations given by Hotelling, since

(6) $X'X = [a_{ij}],$

where $a_{ij} = S(x_{i\alpha} x_{j\alpha})$.

Let $C = [c_{ij}]$ denote the reciprocal of the matrix $X'X$, so that $V(b_i) = c_{ii}\sigma^2$
and cov $(b_i b_j) = c_{ij}\sigma^2$. Then the mean variance of the $p$ unknowns for a design
is given by

(7) $\dfrac{\sigma^2}{N} \times \dfrac{N \sum_{i=1}^{p} c_{ii}}{p}.$

If the main object of the experiment is to estimate the unknowns with the
least variance, the most efficient design (for a specified value of $N$) would be
the one for which the minimum minimorum of $\sigma^2/N$ is attained for all the $p$
unknowns so that the mean variance in this case is $\sigma^2/N$. The factor,
$N \sum_{i=1}^{p} c_{ii}/p$, on the right-hand side of (7), therefore, measures the increase in
variance resulting from the adoption of any design other than the most efficient
design. Its reciprocal, $p / (N \sum_{i=1}^{p} c_{ii})$, may appropriately be defined as the
efficiency of a given design for providing estimates of the $p$ unknowns. This
quantity will now be utilized for judging the relative precision of the general
designs discussed in the subsequent paragraphs.
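
The efficiency just defined is straightforward to compute for a given design matrix. The Python sketch below forms $X'X$, inverts it exactly over the rationals and returns $p / (N \sum c_{ii})$; the representation of $X$ as a list of $N$ rows with $p$ entries and the function names are assumptions made for the illustration.

```python
from fractions import Fraction


def gram(X):
    """X'X for a design given as N rows (weighings) of p entries (objects)."""
    p = len(X[0])
    return [[sum(Fraction(row[i] * row[j]) for row in X) for j in range(p)]
            for i in range(p)]


def inverse(A):
    """Gauss-Jordan inverse over the rationals; raises if A is singular."""
    n = len(A)
    M = [list(row) + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("X'X is singular")
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [x / pv for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]


def efficiency(X):
    """Efficiency p / (N * sum of c_ii), with C the reciprocal of X'X."""
    N, p = len(X), len(X[0])
    C = inverse(gram(X))
    return Fraction(p) / (N * sum(C[i][i] for i in range(p)))
```

As an example, weighing four objects one at a time (the identity matrix) gives efficiency 1/4, whereas a 4 by 4 matrix of $\pm 1$ entries with orthogonal columns gives efficiency 1.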


3. Design for $N = 2^m$, $p \le 2^m$ (zero bias) or $p \le 2^m - 1$ (non-zero bias).

By utilizing the properties of a 2-sided m-fold completely orthogonalized Hyper-Graeco-Latin hyper-cube of the first order introduced by the author [4], it is easy to see that for $N = 2^m$, $p \le 2^m$ (when there is zero bias) or $p \le 2^m - 1$ (when there is bias), m being any positive integer, a completely orthogonalized design can be constructed with each unknown weight estimated with the minimum variance $\sigma^2/N$. As remarked by Hotelling in the case of N = 4, p = 4 (for zero bias) or p = 3 (if there is bias), the matrix X'X for this design is a scalar matrix of order p × p if there is zero bias, or of order (p + 1) × (p + 1) if there is bias, each of the diagonal elements being N. The reciprocal matrix is also a scalar matrix in which each of the diagonal elements is 1/N, so that the estimates of all the unknowns are mutually orthogonal.
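One way to obtain a matrix with the stated property is sketched below. It is offered only as an illustration under an assumption: the paper builds its designs from Hyper-Graeco-Latin hyper-cubes, whereas this sketch substitutes the familiar doubling construction of a $2^m \times 2^m$ array of +1's and -1's, which also has mutually orthogonal columns. The helper names are hypothetical.

```python
def sylvester(m):
    """Return a 2^m x 2^m array of +1/-1 entries with mutually orthogonal columns."""
    H = [[1]]
    for _ in range(m):
        # Doubling step: replace H by the block array [[H, H], [H, -H]].
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H

def gram(X):
    """Return X'X for a design matrix X given as a list of rows."""
    p = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]

def design(m, p):
    """Keep the first p columns of the 2^m x 2^m array (zero-bias case)."""
    return [row[:p] for row in sylvester(m)]

X = design(4, 5)  # N = 16 weighings, p = 5 unknowns
G = gram(X)
is_scalar = all(G[i][j] == (16 if i == j else 0) for i in range(5) for j in range(5))
print(is_scalar)
```

Any subset of the columns could be kept in place of the first p, since omitting columns leaves the remaining ones orthogonal.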

As a particular case of this general design, we may take N = 16, p = 16 (for zero bias) or p = 15 (if there is bias), the completely orthogonalized design for which is represented by the matrix for which X'X is a scalar matrix of order 16 × 16, each diagonal element being 16. Again, a completely orthogonalized design for N = 16, p < 16 (for zero bias) or p < 16 (if there is bias) is represented by a matrix X obtained from the matrix in (8) by omitting any 16 - p of its columns if there is zero bias, or 16 - p - 1 of its columns if there is bias. In the matrix X, permutation of rows and columns is permissible and each such matrix represents a completely orthogonalized design.

For the design given by Hotelling¹ for N = 4, p = 3 (zero bias), the efficiency
is 35 per cent. The completely orthogonalized design for which the efficiency
is 100 per cent is represented by the matrix

1 ll 

1 -1 
1 1 

1 -1^ 

4. First design for $N = 2^m + 1$, $p \le 2^m$ (zero bias) or $p \le 2^m - 1$ (non-zero bias). For $N = 2^m + 1$, $p \le 2^m$ (zero bias) or $p \le 2^m - 1$ (if there is bias), m being any positive integer, probably the most efficient design available seems to be that represented by the matrix X obtained from the corresponding matrix

¹ The allusions here and at the end of the next section are to designs on p. 305 of the
Hotelling paper [1], a passage concerned with designs subject to the restriction that the
entries of the matrix be 0's and +1's only, as is necessary in many types of measurement.
The more efficient designs given above, whose matrices involve -1's also, can be used only
in such cases as that of weighing in a balance, where the objects under investigation can be
put, some in one pan and some in the other. Such situations are considered in a different
part of Hotelling's paper.
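X for the general design of Section 3 above by adding a row 1, 1, $\cdots$, 1 to it. The matrix X'X for this design then comes out as

(10) $X'X = \begin{bmatrix} N & 1 & 1 & \cdots & 1 \\ 1 & N & 1 & \cdots & 1 \\ 1 & 1 & N & \cdots & 1 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & 1 & \cdots & N \end{bmatrix}$,
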

which is a symmetrical matrix of order p X p if there is zero bias, or of order 
(p + 1) X (p + 1) if there is bias. The variance of each unknown for this 
design is 


(11) $\dfrac{\sigma^2}{N - \dfrac{p-1}{N+p-2}}$ for zero bias,

or

(12) $\dfrac{\sigma^2}{N - \dfrac{p}{N+p-1}}$ if there is bias.

Thus the efficiency of this design is

(13) $1 - \dfrac{p-1}{N(N+p-2)}$ for zero bias,

or

(14) $1 - \dfrac{p}{N(N+p-1)}$ if there is bias.

The loss of efficiency resulting from the adoption of this design is, therefore, $\dfrac{p-1}{N(N+p-2)}$ for zero bias or $\dfrac{p}{N(N+p-1)}$ if there is bias.


As a particular case of this, for N = 5, p = 2 (zero bias), probably the most
efficient design available is specified by


(15) 


X 


1 

1 

1 

-1 

-1 


The variance of each unknown in this case is $\dfrac{5\sigma^2}{24}$ and the efficiency of the design is 96 per cent. For the design given by Hotelling for this case, the variance of each unknown is $\dfrac{4\sigma^2}{7}$ and the efficiency is 35 per cent. It would thus appear that, as judged by the criterion of efficiency as defined here, the design represented by the matrix in (15) is more efficient than Hotelling's design.
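The figures quoted for N = 5, p = 2 can be checked from the closed forms (11) to (14). The sketch below is illustrative only; the function names are hypothetical, and the bias case is handled by treating the bias as one further unknown, which is how (12) and (14) follow from (11) and (13).

```python
from fractions import Fraction

def variance_first_design(N, p, bias=False):
    """Variance of each unknown in units of sigma^2, equations (11) and (12)."""
    q = p + 1 if bias else p  # number of columns of X, counting the bias if present
    return 1 / (N - Fraction(q - 1, N + q - 2))

def efficiency_first_design(N, p, bias=False):
    """Efficiency of the design, equations (13) and (14)."""
    q = p + 1 if bias else p
    return 1 - Fraction(q - 1, N * (N + q - 2))

def efficiency_from_variance(N, variance):
    """Efficiency 1 / (N * c) when every unknown has the same variance c * sigma^2."""
    return 1 / (N * variance)

print(variance_first_design(5, 2))                  # variance factor for design (15)
print(efficiency_first_design(5, 2))                # efficiency of design (15)
print(efficiency_from_variance(5, Fraction(4, 7)))  # efficiency implied by a variance of 4/7
```

The last line shows that a variance of $4\sigma^2/7$ per unknown with N = 5 corresponds to the 35 per cent quoted for Hotelling's design.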

5. Second design for $N = 2^m + 1$, $p \le 2^m$ (zero bias) or $p \le 2^m - 1$ (non-zero bias). Another interesting design for these values of N and p is that represented by the matrix X obtained by adding a row 1, 0, $\cdots$, 0 to the corresponding matrix X for the general design in Section 3 above. The matrix X'X for this design is then the diagonal matrix

(16) $X'X = \begin{bmatrix} N & 0 & \cdots & 0 \\ 0 & N-1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & N-1 \end{bmatrix}$

of order p × p (for zero bias) or (p + 1) × (p + 1) (for non-zero bias). As the reciprocal of this matrix is also a diagonal matrix, the estimates of all the unknowns are mutually orthogonal. The efficiency of this design is

(17) $\dfrac{(N-1)p}{Np-1}$ for zero bias,

or

(18) $\dfrac{N-1}{N}$ for non-zero bias.



By comparing the efficiency of the first design given in (13) and (14) with that 
of the second design in (17) and (18) respectively, it would appear that the 
efficiency of the first design is always higher than that of the second design for 
non-zero bias, and is also higher in the case of zero bias for p > 1, but equal for 
p = 1. 

6. First design for $N = 2^m + r$, $p \le 2^m$ (for zero bias) or $p \le 2^m - 1$ (for non-zero bias). For $N = 2^m + r$, $p \le 2^m$ (for zero bias) or $p \le 2^m - 1$ (for non-zero bias), m being any positive integer and r any positive integer $< 2^m$, a highly efficient design is represented by the matrix X obtained from the corresponding matrix X for the general design in Section 3 above by adding r rows 1, 1, $\cdots$, 1 to it. The matrix X'X for these designs then comes out as

(19) $X'X = \begin{bmatrix} N & r & r & \cdots & r \\ r & N & r & \cdots & r \\ r & r & N & \cdots & r \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ r & r & r & \cdots & N \end{bmatrix}$,

which is of order p × p for zero bias, or of order (p + 1) × (p + 1) for non-zero bias. The variance of each unknown determined by this experiment is

(20) $\dfrac{\sigma^2}{N - \dfrac{(p-1)r^2}{N+(p-2)r}}$ for zero bias,

or

(21) $\dfrac{\sigma^2}{N - \dfrac{pr^2}{N+(p-1)r}}$ if there is bias.

Hence the efficiency of this design is

(22) $1 - \dfrac{(p-1)r^2}{N[N+(p-2)r]}$ for zero bias,

or

(23) $1 - \dfrac{pr^2}{N[N+(p-1)r]}$ if there is bias.

The loss of efficiency as a result of adopting this design is, therefore, $\dfrac{(p-1)r^2}{N[N+(p-2)r]}$ for zero bias, or $\dfrac{pr^2}{N[N+(p-1)r]}$ if there is bias.

7. Second design for $N = 2^m + r$, $p \le 2^m$ (for zero bias) or $p \le 2^m - 1$ (for non-zero bias). Another design for these values of N and p is that represented by the matrix X obtained from the corresponding matrix X for the general design in Section 3 above by adding to it r rows 1, 0, 0, $\cdots$, 0. The matrix X'X for this design is then given by

(24) $X'X = \begin{bmatrix} N & 0 & \cdots & 0 \\ 0 & N-r & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & N-r \end{bmatrix}$,

which is of order p × p if there is zero bias, or of order (p + 1) × (p + 1) if there is bias. Here also the estimates of all the unknowns are mutually orthogonal. The efficiency of the design comes out to be

(25) $\dfrac{(N-r)p}{Np-r}$ if there is zero bias,

(26) $\dfrac{N-r}{N}$ if there is bias.
By comparing the efficiency of the first design of this type given in (22) and (23) with that of the present design given in (25) and (26) respectively, it would appear that in case of zero bias, the efficiency of the first design is higher than that of the second design for p > 1, but equal for p = 1; and in case of non-zero bias, the efficiency of the first design is always higher than that of the second.
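The comparison can be confirmed numerically over a range of cases. The sketch below is illustrative only, with hypothetical function names; it evaluates (22), (23), (25) and (26) in exact arithmetic for m = 2, 3, 4 and every admissible r and p.

```python
from fractions import Fraction

def eff_first(N, p, r, bias=False):
    """Equations (22) and (23): r rows 1, 1, ..., 1 added."""
    q = p + 1 if bias else p  # columns of X, counting the bias if present
    return 1 - Fraction((q - 1) * r * r, N * (N + (q - 2) * r))

def eff_second(N, p, r, bias=False):
    """Equations (25) and (26): r rows 1, 0, ..., 0 added."""
    if bias:
        return Fraction(N - r, N)
    return Fraction((N - r) * p, N * p - r)

def first_never_worse(max_m=4):
    """Check the stated ordering of the two designs for m = 2, ..., max_m."""
    for m in range(2, max_m + 1):
        for r in range(1, 2 ** m):
            N = 2 ** m + r
            for p in range(1, 2 ** m + 1):
                gap = eff_first(N, p, r) - eff_second(N, p, r)
                if (p == 1 and gap != 0) or (p > 1 and gap <= 0):
                    return False
            for p in range(1, 2 ** m):
                if eff_first(N, p, r, bias=True) <= eff_second(N, p, r, bias=True):
                    return False
    return True

print(first_never_worse())
```

Setting r = 1 in the same functions reproduces the comparison of (13) and (14) with (17) and (18).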

8. Comprehensive general design when N is at our choice. When N is at our choice, we can always obtain a completely orthogonalized design by taking N equal to a sufficiently large power of 2. For $p = 2^m$, m being any positive integer, a completely orthogonalized design for $N = 2^m$, when there is zero bias, has been given in Section 3 above. If, however, there is a bias, a completely orthogonalized design can be constructed for $N = 2^{m+1}$. When $p = 2^m + u$, where u is a positive integer $< 2^m$, a completely orthogonalized design is available for $N = 2^{m+1}$ whether the bias is zero or not.

For $N = 2^{m+1}$, this is the most efficient design, with 100 per cent efficiency, but as N is given higher powers of 2 than $2^{m+1}$ the variance of the estimate of each unknown decreases. When $N = 2^l$, where $l > m + 1$, the variance of each unknown is $1/2^{\,l-m-1}$ of that for $N = 2^{m+1}$.
NOTE ON THE LAW OF LARGE NUMBERS AND “FAIR” GAMES

By W. Feller
Cornell University

1. “Fair” games. Let $\{X_k\}$ be a sequence of independent random variables with the same cumulative distribution function V(x). Suppose that the expectation

(1) $E(X_k) = \int_{-\infty}^{\infty} x\,dV(x) = M$

exists, and put

(2) $S_n = X_1 + \cdots + X_n$.

The weak law of large numbers states¹ that for every $\epsilon > 0$ and $n \to \infty$

(3) $\Pr\{|S_n - nM| < \epsilon n\} \to 1$.

In the picturesque language of the theory of games this means that, after a large number of trials, the accumulated gain $S_n$ will, with great probability, be of the order of magnitude of nM. This led to the definition that a game is “fair” if the entrance fee for each trial is M. Unfortunately this definition creates the erroneous notion that a “fair” game is necessarily fair. To disprove it we shall (section 3) exhibit an example which will show:

(I) A game can be “fair” and nevertheless such that the probability tends to one that, after n trials, the player will have sustained a loss $L_n = nM - S_n$ of the order of magnitude $n(\log n)^{-\eta}$, where $\eta > 0$ is arbitrarily small. In other words, in our example

(4) $\Pr\{nM - S_n > (1 - \epsilon)\,n(\log n)^{-\eta}\} \to 1$.

Of course, $L_n$ is necessarily of smaller order of magnitude than n; however, our example can be modified in such a way that the ratio of the loss $L_n$ to the accumulated entrance fees nM decreases as slowly as one pleases.

This shows that a “fair” game can be exceedingly disadvantageous. Conversely, an “unfair” game can very well be advantageous. If a careful driver insures his car, the game is clearly “unfair” according to definition, and yet some states impose such games on drivers. Now in this and many other practical cases the game is of such a nature that there is a very small probability p of winning a comparatively great amount A; the “fair” price would be pA. In such cases the law of large numbers would be significant only if n is large compared to 1/p, whereas actually the maximum number of games to be played is comparatively small. Clearly any theory meets practical requirements only if it makes allowance for the number of trials and makes the “fair” price depend on the number of trials.

¹ Usually (3) is proved only under more restrictive hypotheses. Actually the finiteness of $E(X_k)$ implies even the strong law of large numbers; cf. Kolmogoroff, Grundbegriffe der Wahrscheinlichkeitsrechnung (Berlin 1933), p. 69.

2. The Petersburg “paradox.” For obvious reasons the classical theory of probability was unable to provide a precise formulation of the law of large numbers and to establish the actual conditions of its validity. Often it has been looked upon as a direct consequence of the definition of probability, and this led to the so-called Petersburg paradox which presents no difficulties to the modern theory. It refers to the case where the expectation (1) is infinite. The usual example exhibits a game in which the possible gains in each trial are distributed according to

(5) $\Pr\{X = 2^k\} = 2^{-k}$.

Here $M = \infty$. Now the law of large numbers (3) used to be proved (if at all) only assuming the existence of moments of higher order. Nevertheless, the classical theory postulated the validity of (3) even for $M = \infty$, and treating $\infty$ as a number (with $\infty - \infty = 0$) it argued that $\infty$ is a “fair” price for the game as defined in (5). Great ingenuity was exercised in order to reconcile this result with commonsense.² Actually one can pass from (3) to the limit $M \to \infty$, but the only result to be arrived at is trivial and could be anticipated without theory: If the player pays for each trial a fixed amount A, he is likely to have a positive gain provided he plays sufficiently long, i.e., provided n > N(A), where N(A) itself increases with A.

Instead of a paradox we reach the conclusion that the price should depend on n, that is to say vary as the number of trials increases. For best results this should be the case even if M is finite. It should be noticed that in the Petersburg case (5) a variable price can be determined so that a law of large numbers will hold which is in every respect analogous to (3). In this formula nM is simply the accumulated amount of entrance fees; denoting it by $P_n$, formula (3) takes on the equivalent form


(6) $\Pr\{|S_n - P_n| < \epsilon P_n\} \to 1$.

It is this interpretation of (3) that leads to the notion of “fair” games. Now the Petersburg game can also be played in a “fair” way:

(II) Let the player in the Petersburg game (5) at the k-th trial pay the amount³ $\log_2 k$. The accumulated entrance fees up to the n-th trial are $P_n \sim n \log_2 n$, and the game is “fair” in the sense that the law of large numbers (6) holds. This requirement determines the entrance fees essentially uniquely (that is to say up to terms of smaller order of magnitude which, by definition, remain undetermined).

² Among the latest textbooks, von Mises (Wahrscheinlichkeitsrechnung, Leipzig-Wien 1931, p. 108f.) avoids the difficulty by declaring that (5) can not represent a collectif because of its infinite tail. This viewpoint is legitimate, but makes the law of large numbers inapplicable to practically all useful distributions. Fry (Probability and its Engineering Uses, New York, 1928, p. 197) says: “The true explanation of the paradox is . . . based upon the fact that in our every-day experience we have to deal only with individuals who have finite fortunes and who would therefore be incapable of paying back the sums which are required . . .”. The problem does not seem to be mentioned in Uspensky's book.

3. Proofs. Theorems (I) and (II) follow easily from the following

Lemma: Let $\{a_n\}$ be a sequence of positive numbers; in order that there exist a sequence $\{b_n\}$ such that

(7) $\Pr\{|S_n - b_n| < \epsilon a_n\} \to 1$

it is necessary and sufficient that for every $\epsilon > 0$ simultaneously



in this case (7) will hold with

(9) $b_n = \sum_{k=1}^{n} \int_{|x| < a_k} x\,dV(x)$

(and, of course, for any other sequence $\{b_n'\}$ if and only if $|b_n' - b_n| = o(a_n)$).
This lemma is a simple consequence of the necessary and sufficient conditions for the generalized law of large numbers⁴.

To prove theorem (II) we have to determine a sequence $\{a_n\}$ such that (7) will hold for the distribution function defined in (5) and with $b_n \sim a_n$. A simple computation shows that (8) will hold for any sequence $\{a_n\}$ which increases faster than n. Moreover, the sequence $\{b_n\}$ defined by (9) will be of the same order of magnitude as $\{a_n\}$ if, and only if, $a_n \sim n \log_2 n$. This proves (II).
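The "simple computation" for the centering constants can be made concrete. The sketch below is illustrative only and rests on two assumptions not stated in this copy: that k in (5) runs over 1, 2, ..., and that a value is included in the truncated integral when it is strictly below $a_k$. Under (5) each attainable value $2^k$ contributes $2^k \cdot 2^{-k} = 1$ to $\int x\,dV(x)$, so the truncated integral in (9) merely counts the powers of two below the truncation point.

```python
import math

def truncated_mean(a):
    """Integral of x dV(x) over |x| < a for the distribution (5)."""
    count, power = 0, 2
    while power < a:
        count += 1   # the value `power` adds power * (1 / power) = 1
        power *= 2
    return count

def centering(n):
    """The constants b_n of (9) with a_k = k log2 k."""
    return sum(truncated_mean(k * math.log2(k)) for k in range(1, n + 1))

def ratio(n):
    """b_n divided by n log2 n; this is of order one for large n."""
    return centering(n) / (n * math.log2(n))

for n in (2 ** 8, 2 ** 12, 2 ** 16):
    print(n, round(ratio(n), 3))
```

The ratio approaches 1 only at a logarithmic rate, since the correction terms are of relative order $\log_2 \log_2 n / \log_2 n$.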

Now let $\eta > 0$ be arbitrary, and define the distribution function V(x) to have a density

(10) $V'(x) = \dfrac{\eta}{x^2 \log^{\eta+1} x}$ for $x > e$;

at x = 0 the function V(x) shall have a jump of magnitude

(11) $1 - \eta \int_{e}^{\infty} \dfrac{dx}{x^2 \log^{\eta+1} x}$,

while V(x) is constant in the intervals x < 0 and 0 < x < e. For this distribution function we have obviously M = 1.

³ $\log_2$ stands for the logarithm to the basis 2.

⁴ Cf. Feller, Acta Univ. Szeged, Vol. 8 (1937), pp. 191-201.
Next, let for n > e

(12) $a_n = n \log^{-\eta} n$.

Then (8) holds and from (9) and (10) we obtain easily for large n

(13) $b_n = \sum_{k \le n} \{1 - \log^{-\eta} a_k\} < n - (1 - \epsilon)a_n$.

Substituting into (7) one sees that, again for sufficiently large n,

(14) $\Pr\{S_n - n + (1 - \epsilon)a_n < \epsilon a_n\} \to 1$,

or, since M = 1,

(15) $\Pr\{S_n - nM < -(1 - 2\epsilon)a_n\} \to 1$.

This proves (I).


A NOTE ON RANK, MULTICOLLINEARITY AND MULTIPLE
REGRESSION¹

By Gerhard Tintner
Iowa State College

Let $X_{it}$ (i = 1, 2, $\cdots$, M) be a set of M random variables, each being observed at t = 1, 2, $\cdots$, N, with $X_{it} = M_{it} + y_{it}$. (This is essentially the situation envisaged by Frisch [1].) The systematic part of our variables is $M_{it} = EX_{it}$. The $y_{it}$ are normally distributed with means zero. Their variances and covariances are independent of t. The $M_{it}$ and $y_{it}$ are independent of each other. Define $\bar{X}_i = \sum_t X_{it}/N$, the arithmetic mean of $X_{it}$, and $x_{it} = X_{it} - \bar{X}_i$, the deviation from the mean. Then $a_{ij} = \sum_t x_{it}x_{jt}/(N - 1)$ gives the variances and covariances of the observations. We want to determine the rank of the matrix of the variances and covariances of $M_{it}$.

Now assume that $\|V_{ij}\|$ is an estimate of the variance-covariance matrix of the error terms or “disturbances” $y_{it}$. The elements of this matrix are distributed according to the Wishart distribution and are independent of the $M_{it}$. They can be estimated as deviations from polynomial trends, as deviations from Fourier series, by the Variate Difference Method, etc. The estimates could also be based upon a priori knowledge if for instance the $y_{it}$ are interpreted as errors of measurement. Assume that the estimate is based upon N' observations.

¹ The author is much obliged to Professors W. G. Cochran (Iowa State College), H.
Hotelling (Columbia University), T. Koopmans (University of Chicago) and A. Wald
(Columbia University) for advice and criticism with this paper. He has also profited by
reading the unpublished paper: “On the Validity of an Estimate from a Multiple Regression
Equation” by F. V. Waugh and R. O. Been which deals in part with a problem related to
the one presented here.

Journal Paper No. J>1S28 of the Iowa Agricultural Experiment Station; Amea, Iowa. Project No. 730. 
Form the determinantal equation:

(1) $|a_{ij} - \lambda V_{ij}| = 0$.

Apart from sampling fluctuations there should be r solutions $\lambda = 1$ of equation (1) if there are r independent linear relationships between the $M_{it}$. The rank of the variance-covariance matrix of $M_{it}$ is then M - r. Following a suggestion of P. L. Hsu [2] made on the basis of the earlier work of R. A. Fisher [3] we form the test function

(2) $\Lambda_r = (N - 1)(\lambda_1 + \lambda_2 + \cdots + \lambda_r)$,

where $\lambda_1$ is the smallest root of (1), $\lambda_2$ the next smallest, etc. Hence (2) is the sum of the r smallest roots of equation (1), multiplied by N - 1. The hypothesis to be tested is that there are exactly r independent linear relationships between the systematic parts of our variables in the population. This quantity (2) is distributed like $\chi^2$ with r(N - M - 1 + r) degrees of freedom for large samples, i.e. if N' becomes large. It can be used for forming an opinion about the number of independent relationships existing among the systematic parts of our variables ($M_{it}$).

The importance of the question of the rank lies in the following: Sometimes we are not so much interested in making predictions as to estimate the “true” relationships which exist in the population which corresponds to our sample (Wald) [4]. Practically speaking, these relationships and their estimation are of great importance in economic statistics, as Haavelmo has shown [5]. But a knowledge of the rank, i.e. the number of independent relationships existing between the systematic parts of the variables, may also be of some significance for the problem of prediction. The inclusion of strongly correlated predictors cuts down on the number of degrees of freedom without contributing significantly to the reduction of the variance.

The remainder of this paper will be concerned with an attempt to estimate
the relationships which in the population exist between the systematic parts
of the variables. This is an extension of the work of T. Koopmans [6] and the
author [7] who dealt with the special case in which there is only one relationship
between the systematic parts.

Suppose that we decide that there are R independent relationships among the
systematic parts of our variables

(3) kvQ h S KjMjt =* /i»< “ 0; * 1, 2, • • *, ill, « » 1, 2, • • •, JV. 

We desire to obtain estimates of these relationships. Our purpose here is not prediction but estimation of the structural coefficients $k_{vj}$.

The method of maximum likelihood leads to the method of least squares if we treat the $V_{ij}$ as constants. This is again permissible if N' is large and our estimates of the $V_{ij}$ become reasonably accurate. We have to minimize the following sum of squares


(4) $Q = \sum_t Q_t$,

where

(5) $Q_t = \sum_i \sum_j V^{ij}(x_{it} - m_{it})(x_{jt} - m_{jt})$,

where $\|V^{ij}\| = \|V_{ij}\|^{-1}$ is the inverse of the variance-covariance matrix of the errors. We also define $m_{it} = M_{it} - \bar{M}_i$ (t = 1, 2, $\cdots$, N), where $\bar{M}_i$ is the mean of $M_{it}$.

If there are R relationships (3) they can be written by using only R(M - R) coefficients $k_{vj}$ (j = 1, 2, $\cdots$, M), if we disregard the constant terms $k_{v0}$, because we are now dealing with deviations from means. We can for instance express the first (M - R) variables $m_{it}$ in terms of the last R variables $m_{it}$. Hence, we have to impose $R^2$ conditions upon the MR coefficients $k_{vj}$ (j = 1, 2, $\cdots$, M) appearing in (3).

We impose R{R + l)/2 conditions as follows 

( 6 } . = ffvw ~ > 

where $\delta_{vw}$ is a Kronecker delta. These conditions orthogonalize and normalize the coefficients $k_{vj}$. We have now to adjust the $Q_t$ as given in (5) under the conditions (6) by determining appropriate $m_{it}$. This is a problem of restricted minima.

We introduce a new function 

( 7 ) Ft ^ Qt , 

V 

where the $\mu_{vt}$ are Lagrange multipliers. Differentiating with respect to $m_{it}$ and setting equal to zero we get the solution:

(8) $\sum_j V^{ij}(x_{jt} - m_{jt}) = \sum_v \mu_{vt} k_{vi}$; (i = 1, 2, $\cdots$, M);

or, solving for $x_{it} - m_{it}$,

(9) $x_{it} - m_{it} = \sum_v \mu_{vt} \sum_j V_{ij} k_{vj}$; i = 1, 2, $\cdots$, M.

Multiplying (9) by Ki and summing we get 


(10) 


Hence we have 


(11) 


Now we dispose of the remaining R{R — l)/2 conditions 


( 12 ) 2 ] * h.^ * 0 , V ^ w, 

t 

We have to maximize Q under the $R^2$ conditions (6) and (12). This is done
by finding the appropriate $k_{vj}$.

We form a new expression 

(13) 0 ^ Q + 

where the $\alpha_{vw}$ and $\beta_{vw}$ ($v \ne w$) are again Lagrange multipliers and $\beta_{vv} = 0$. Because of considerations of symmetry we have: $\alpha_{vw} = \alpha_{wv}$ and $\beta_{vw} = \beta_{wv}$.

Differentiating with respect to $k_{vi}$ and setting equal to zero we get the condition

^t{^jkvfX^t)Xii " 4 ” ^w^irw^t^jkpijXj^Xit 

(14) 

~ ocvw 2y T^{y1 ^ ~ 1» ' * * > t = I9 2) • • ■ , M* 

Multiplying by Ki and summing we get 

(15) = avt». 

Multiplying by kai {z 9 ^ v) and summing we have 

(16) “ Ova {v 9^ z). 

Both (15) and (16) follow from conditions (6) and (12). 

Exchanging the role of v and z in (16) we have also 

(17) = otpg (v 9 ^ zY 

Hence we have ~ fiv = 0, xivi^w. Inserting these results in (14) we get a 
system of linear and homogeneous equations in the unknown coefficients $k_{vj}$. The determinant of the system must be equal to zero in order to yield non-trivial solutions. Trivial solutions are not admitted because of (6). Hence the $\alpha_{vv}$ are simply the roots k of the equation $|\sum_t x_{it}x_{jt} - kV_{ij}| = 0$.

Introducing

(18) $\lambda_v = \alpha_{vv}/(N - 1)$,

expression (14) becomes actually the determinantal equation (1). This expression can be used to find the smallest latent roots $\lambda_v$ and the corresponding characteristic vectors $k_{vj}$ by Hotelling's methods [8].

The constants of the equation (3) are finally determined by the condition that the optimum solutions have to go through the means of the variables

(19) $k_{v0} + \sum_j k_{vj}\bar{X}_j = 0$.

The distribution of the variances and covariances of the observations has recently been established by T. W. Anderson and M. A. Girshick for the cases R = M - 1 and R = M - 2 [9].
NOTE ON THE DISTRIBUTION OF THE SERIAL
CORRELATION COEFFICIENT¹

By William G. Madow

Bureau of the Census

The distribution of the serial correlation coefficient when $\rho = 0$ has been previously obtained.² The purpose of this note is to derive the distribution of the serial correlation coefficient, using the circular definition, when $\rho \ne 0$.

Let us assume that the random variables $x_1, \cdots, x_N$ have a joint normal distribution³ $p(x_1, \cdots, x_N \mid A, B, \mu)$ where

$\log p(x_1, \cdots, x_N \mid A, B, \mu) = \log K_1 - \tfrac{1}{2}\left[A \sum_i (x_i - \mu)^2 + 2B \sum_i (x_i - \mu)(x_{i+L} - \mu)\right]$,

the term in the bracket is positive definite, $K_1$ is independent of the $x_i$ and if i + L > N then $x_{i+L} = x_{i+L-N}$. It is then clear that $\bar{x}$, $V_N$, and ${}_LC_N$, where $\bar{x}$ is the arithmetic mean, $V_N = \sum_i (x_i - \bar{x})^2$ and

${}_LC_N = \sum_i (x_i - \bar{x})(x_{i+L} - \bar{x})$

are sufficient statistics with respect to the estimation of $\mu$, A, and B.
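The three statistics, and the ratio ${}_LC_N / V_N$ that the text goes on to call the serial correlation coefficient, are straightforward to compute with the circular convention $x_{i+L} = x_{i+L-N}$. A minimal sketch follows; the function names are illustrative and not taken from the paper.

```python
def circular_statistics(x, L=1):
    """Return (mean, V_N, C) with C the circular lag-L sum of products."""
    N = len(x)
    mean = sum(x) / N
    d = [xi - mean for xi in x]
    V = sum(di * di for di in d)
    # Indices wrap around: the partner of x_i is x_{(i + L) mod N}.
    C = sum(d[i] * d[(i + L) % N] for i in range(N))
    return mean, V, C

def serial_correlation(x, L=1):
    """Circular serial correlation coefficient C / V_N."""
    _, V, C = circular_statistics(x, L)
    return C / V

print(circular_statistics([1.0, 2.0, 3.0, 4.0]))
print(serial_correlation([1.0, 2.0, 3.0, 4.0]))
```

With the wrap-around term included, a steadily increasing short series can still give a negative coefficient, because the last observation is paired with the first.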

Let $V_N \cdot {}_LR_N = {}_LC_N$ define ${}_LR_N$, the serial correlation coefficient. Then if

^ Presented at a meeting of the Cowles Commission for Economic Research in Chicago, 
January 31,1946. 

² See R. L. Anderson, “Distribution of the serial correlation coefficient”, pp. 1-13 and T. Koopmans, “Serial correlation and quadratic forms in normal variables”, pp. 14-33, Annals of Math. Stat., Vol. XIII, No. 1, March, 1942.

³ The expression $p(x_1, \cdots, x_k \mid \theta_1, \cdots, \theta_g)$ means the probability density or the distribution of the random variables $x_1, \cdots, x_k$ for the given values of the parameters $\theta_1, \cdots, \theta_g$. When used as an index of summation or multiplication, the letter i will assume all values from 1 through N.
A = 1, B = 0 Anderson has shown⁴ that, if N is odd, the joint distribution of
${}_1R_N$ and $V_N$ is given by

(1) D(B„, 7,) - Z (X, - for W ^ 


2rk 

Bh = iBn, Xt •= COB «< « II (X< — Xy), for all; t 

and fC'* = 2“""“ T[i{N - 3)]; while if JV is even, the same formula holds except 
that 

*“ II 0<i h) V(X< + 1), for all j t. 

j-i 

We now extend Anderson's distributions to the case where it is not assumed
that A = 1 and B = 0.

As a means of extending^ Anderson’s distribution let us recall that if xi, * • • , 
Xft have a distribution p(xi Xk I Oi 0^) depending on several param¬ 
eters Oi, ••• f 0g, and if 2 i, • • • , 2 * are a sufficient set of statistics with respect 
to , • • • , 6g, i.e. 

p(x , •' • , Xjg I Oi, • • • , Bg) ^ h(zi , • • • , 2* 1 , • • • , eg)m(xi , • • • , 

where m(xi , • • • , is independent of Bi, •**, Sg, then if the distribution of 

2 i, * • • , 2 fc is found, assuming , • • • , Bg have specific values ^?, • • • , 6j , 
then it follows that 

/ 1/1 \ / I /nO /»0\ X(2l , * * * f ^k\ Bl f ***> B g') 

pi^l 1 ' * * » , • * • , Bg) = p(2l , • * • , 2fc I 01 , • • • Bg) , 2jfe 1 01 • • • 0*^) ' 

We may call Anderson's distribution given in (1) $p(R_N, V_N \mid 1, 0)$, i.e.

$p(R_N, V_N \mid 1, 0) = D(R_N, V_N)$.

Furthermore, $\bar{x}$ is distributed independently of $R_N$ and $V_N$ for all values of A and B and hence by a simple transformation,⁶ we can apply the above theorem.

* Anderson loc. cit. p. 3 and p. 5. Although the remainder of the note deals only with 
the case where L » 1 the procedure is general and may be easily carried through for other 


⁵ See W. G. Madow, “Contributions to the theory of multivariate statistical analysis”, Trans. of the Amer. Math. Soc., Vol. 44, No. 3, November 1938, p. 461.

⁶ For a proof that an orthogonal transformation of the variable $x_i - \mu$ exists such that $V_N$ and ${}_LC_N$ are simultaneously reduced to canonical forms involving the same N - 1 of the variables of the transformation, and $\sqrt{N}(\bar{x} - \mu)$ is the Nth variable of the transformation, see J. von Neumann, “Distribution of the ratio of the mean square successive difference to the variance”, Annals of Math. Stat., Vol. XII, No. 4, December 1941, pp. 368, 369. The proof there is given for $V_N$ and $\sum (x_i - x_{i+1})^2$ but is easily extended to this case.

Then it is easy to show that JV(2 — n) is independently distributed of Vs , and ijCs and 
has distribution log pWN{^ — m) |A, 0] «■ logiC* — i[A + - m)i where Kt * 

(A + 20)1 and - /Ci. 
Then 


I A, S) = p(iJ^, Fj, 11, 0)0 

Where 

v' .-»Urjir+»»*Ari'iir) 

Q = _ 

Hence it follows that, 

p(R„ , Fj. I A, 5) = £ (X, - J?y)*‘"-‘Va<, 

for X,„+i < Rtf < \fn y where the a< have different values according to whether N 
is odd or even. In order to evaluate p{Rn | A, B) we then need only integrate 
out Vjf, Now 

jJ" yUir-Z) ^ 

Hence 

p(Rb I a, B) = KK[{2t)*‘’ mN - l)l(A/2 + £ (X, - 

«-il 

The parameters $K_1$, A and B depend on the different types of assumptions that
may be made. In general

Ki = (2ir)"*"A‘" 

where A is a circulant (oi, • • • , ay) such that 

Oi = A, ai+L = B, ai+^y-L} = B, a, = 0 otherwise, 
and hence 

^ = n (-A + 5 cos = n (^ + Bx,). 

Then, one assumption is 

A = -„ B = -pA* 

ff 

where $\rho$ is the “true” serial correlation coefficient. Other assumptions are
possible.⁷ However, these vary with the problem under consideration and may
be left for further examination.

⁷ One possible alternative definition is given by W. J. Dixon, “Further contributions to
the problem of serial correlation”, Annals of Math. Stat., Vol. XV, No. 2, June 1944, p. 120,
equation (2.1).
NOTE ON A PAPER BY C. W. COTTERMAN AND L. H. SNYDER

By H. B. Mann¹

Ohio State University

C. W. Cotterman and L. H. Snyder [1] gave a method to test simple Mendelian inheritance in randomly collected data. From a population assumed to be at equilibrium a sample is taken. The number of homozygous recessives in the sample is known. We wish to estimate the number of heterozygous individuals in the sample.

Let $\alpha$ be the proportion of recessive genes among all genes in the population; $\pi$, $\rho$, $\tau$ the proportion in the population of homozygous recessives, heterozygous and homozygous dominant individuals respectively and p, r, t the sampling values of $\pi$, $\rho$, $\tau$. Then

(1) $\pi = \alpha^2$, $\rho = 2\alpha(1 - \alpha)$, $\tau = (1 - \alpha)^2$, $p + r + t = 1$.

Cotterman and Snyder use as an estimate of r the quantity $2\sqrt{p}(1 - \sqrt{p})$. It is the purpose of this note to show that this estimate is for all practical purposes equivalent to the maximum likelihood estimate of r.

The joint distribution of p, r and t in samples of n is given by

(2) $P(p, r, t) = \dfrac{n!\,\pi^{np}\rho^{nr}\tau^{nt}}{(np)!\,(nr)!\,(nt)!} = \dfrac{n!\,\alpha^{2np}\,[2\alpha(1-\alpha)]^{nr}\,(1-\alpha)^{2nt}}{(np)!\,(nr)!\,(nt)!}$,

where P(p, r, t) is the probability of obtaining the values p, r, t in samples of n. We wish to maximize P(p, r, t) for fixed values of p with respect to $\alpha$ and r. Maximizing first with respect to $\alpha$ one easily obtains

(3) $2\alpha = 2p + r$.
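The claim that the Cotterman and Snyder estimate is practically the maximum likelihood estimate can be explored by brute force. The sketch below is illustrative only and is not the argument of the note: it substitutes (3) into the logarithm of (2) and searches all integer counts nr for the largest value, assuming 0 < np < n so that both logarithms are finite. The function names are hypothetical.

```python
import math

def cotterman_snyder(p):
    """The estimate 2 sqrt(p) (1 - sqrt(p)) of r."""
    root = math.sqrt(p)
    return 2 * root * (1 - root)

def log_likelihood(n, n_p, n_r):
    """Logarithm of (2) with alpha replaced by its maximizing value from (3)."""
    n_t = n - n_p - n_r
    alpha = (2 * n_p + n_r) / (2 * n)
    return (math.lgamma(n + 1) - math.lgamma(n_p + 1)
            - math.lgamma(n_r + 1) - math.lgamma(n_t + 1)
            + n_r * math.log(2)
            + (2 * n_p + n_r) * math.log(alpha)
            + (n_r + 2 * n_t) * math.log(1 - alpha))

def ml_estimate(n, n_p):
    """Maximum likelihood value of r found by direct search; needs 0 < n_p < n."""
    best = max(range(n - n_p + 1), key=lambda n_r: log_likelihood(n, n_p, n_r))
    return best / n

for n, n_p in ((100, 25), (50, 8), (30, 5)):
    print(n, n_p, ml_estimate(n, n_p), round(cotterman_snyder(n_p / n), 4))
```

In these examples the two estimates differ by no more than a small multiple of 1/n.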

We can regard $\alpha$ as a continuous parameter and hence (3) must hold at any
maximum of P(p, r, t). For any maximum of P(p, r, t) we must further have

{np)\{nr)\{7ii)\ (np)!(wr+ l)\{nl — 1)! 

and 


(np)!(nr)!(n<)I (np)!(nr — l)!(n< + 1)!* 


This leads to the inequalities 


(4) 


nt nr + I nr nl + I 


Substituting $t = 1 - p - r$, $\tau = 1 - \pi - \rho$ one easily obtains from (4)

¹ Research under a grant of the research foundation of Ohio State University.
(5) $\dfrac{\rho n - \rho np + \rho}{n(1 - \pi)} \ge r \ge \dfrac{\rho n - \rho np - \tau}{n(1 - \pi)}$.

The difference of the two bounds is $\dfrac{1}{n}$. Hence r must satisfy an equation

$r = \dfrac{\rho n - \rho np + \rho}{n(1 - \pi)} - \dfrac{\theta}{n}$, $0 \le \theta \le 1$.

Substituting the values for $\pi$, $\rho$ and $\tau$ from (1) and (3) we obtain


-«/ 2 ) 

n 


V + 


2n 


0 , 


2 - 
4n 




( 2 - t)» 

4n» 


+ 4p 


Since $0 \le \theta \le 1$ we obtain from (3)


n ’ 


(6) i + ,j/4p + i-2pJr>i+y/4,+ i-?-2,. 

From (6) we see that for all practical purposes we may use the estimate 

r = 2\/p(l - \/p)- 
NEWS AND NOTICES

Readers are invited to submit to the Secretary of the Institute news items of interest

Personal Items 

Dr. R. G. D. Allen, who has been associated with the Combined Production 
and Resources in Washington has returned to the London School of Economics. 

Dr. Kenneth J. Arnold, who has been doing war research work with the 
Columbia University Statistical Research Group has returned to his position 
at the University of Wisconsin. 

Dr. Lee A. Aroian, on leave from Hunter College is serving as a research 
associate in the Applied Mathematics Panel Project at Berkeley, California 
under the direction of Professor Neyman. 

Dr. Ernest E. Blanche has been appointed to the teaching staff of the Army
University organized by the War Department for American veterans at Florence, 
Italy. 

Assistant Professor Z. W. Birnbaum of the University of Washington has
been promoted to an associate professorship. 

Dr. Alva E. Brandt has returned from the Operational Research Section of 
the Ninth Air Force in Europe. 

Associate Professor R. S. Burington of the Case School of Applied Science
has received the Meritorious Civilian Award from the United States Navy. 

Dr. Irving W. Burr has been promoted to an associate professorship at Purdue
University.

Miss Frances Campbell, after receiving her doctorate at Michigan in June, 
has returned to her position at George Pepperdine College, Los Angeles. 

Professor Harry C. Carver, after a year of service with the Army Air Forces, 
has returned to the University of Michigan. 

Professor W. G. Cochran has returned to Iowa State College from a special
mission to Germany. 

Professor Churchill Eisenhart, who has been doing war research work with 
the Columbia University Statistical Research Group, has returned to the 
University of Wisconsin. 

Miss Mary Elveback has been appointed to an assistant professorship at 
Rockford College. 

Assistant Professor C. H. Fischer of the University of Michigan has been 
promoted to an associate professorship.

Mr. Elvin A. Hoy, who has spent three years with the War Production Board, 
is now Chief of the Statistics Section of the Bureau of Research and Statistics 
of the Social Security Board. 

Professor P. L. Hsu of Kunming, China, has been appointed to a visiting 
professorship of statistics at Columbia University, beginning January 1946. 

Dr. Doncaster G. Humm has received an honorary Doctor of Science degree 
at Bucknell University. 
Mr. Joseph M. Juran, who has served during the war with the Foreign Economic
Administration, is now Chairman of the Department of Administrative
Engineering at New York University.

Dr. Eugene Lukacs has been appointed Professor and Head of the Mathematics
Department at Our Lady of Cincinnati College.

Dr. R. V. Mises of Harvard University has been appointed to a professorship
of aerodynamics and applied mathematics.

Professor A. M. Mood has returned from Princeton University to his position 
at Iowa State College. 

Assistant Professor Henry Scheffé of Syracuse University has been granted
leave of absence to serve as senior mathematician with Princeton University 
Station of Division 2 of NDRC. 

Symposium at the University of California 

A Symposium on Mathematical Statistics and Probability was held at the
University of California at Berkeley on August 13-18, 1945. Those participating
in the symposium as speakers or chairmen were:

Dean G. P. Adams, Prof. E. B. Babcock, Prof. E. M. Beesley, Prof. B. A. Bernstein, Prof. 
Egon Brunswik, Prof. A. H. Copeland, Prof. P. H. Daus, Lt. Comm. F. W. Dresch, Prof. 
G. C. Evans, Miss Evelyn Fix, Prof. Harold Hotelling, Prof. Victor F. Lenzen, Prof. Jay L. 
Lush, Prof. J. H. McDonald, Prof. George F. McEwen, Prof. J. Neyman, Prof. G. Polya,
Prof. Hans Reichenbach, Prof. A. C. Schaeffer, Prof. Morgan Ward, and Dr. Jacob Wolfowitz.


New Members 

The following persons have been elected to membership in the Institute: 

Abbey, Helen, M.A. (Michigan) Stat., Bur. of Records & Stat. Mich. Dept, of Health, 916 
N. Chestnut, Lansing, Michigan. 

Acton, Forman, Ch. E. (Princeton) T/4 Army of the U.S., SED Barracks Area, Oak Ridge,
Tenn.

Aitchison, Beatrice, Ph.D. (Johns Hopkins) Econ. & Stat. Analy., I.C.C. 1929 S. St.,
N.W. Wash., 9, D. C.

Anner, George, A.B. (Western Reserve) Stat. Ohio High. Plan. Sur., 576 So. 18th St. fill92
Arlington, Va.

Bartlett, Maurice, D.Sc. (London) Univ. Lecturer, Cambridge, 1S7 Chesterton Road,
Cambridge, Eng.

Berwick, Leo, A.B. (New York Univ.) Capt., A. C. Asst. to Surgeon Stat. Unit of Psych.
Sect. Hq. AFTRC, T & P Bldg., Fort Worth 2, Texas.

Blackwell, Asst. Prof. David, Ph.D. (Illinois) Math. Dept. Howard Univ. Wash., D. C.

Borland, James, M.A. (Indiana) Capt., Ex. Officer, Inspect. Office, Pine Bluff Arsenal,
Ark.

Brown, Prof. Theo., Ph.D. (Yale) Bus. Stat. Harvard Bus. School, Soldier's Field, Boston
es, Mass.

Bunke, Alfred, M.A. (Columbia) Sen. Stat. N. Y. State Dept. of Labor, S7 Parkwood St.
Albany S, N. Y.

Burington, Asso. Prof. Richard, Ph.D. (Ohio) On leave from Case School of Applied Science,
Cleveland, Ohio, at Present, Head Math., Bu. Ord. USN 5200 N. Carlin Spring
Rd., Arlington, Va.
Campbell, James, Ph.D. (Edinburgh) Univ. Math. Lecturer, Victoria Univ. Coll., Well.,
W.I. New Zeal.

Churchill, Edmund, A. M. (Columbia) 16B6 Union Port Road, New York f, N. Y.

Cornfield, Jerome, B.S. (New York Univ.) Stat. Dept. of Labor, R.F.D. Herndon,
Va.

Cruden, Dorothy, A.B. (California) Stat. in Sampling Sect. Spec. Sur. Div. Bur. of Census
% Pop. Div. Wash., D. C.

Daniel, Cuthbert, M.S. (Mass. Inst. Tech.) Stat. Eng., Carbide and Carbon Chem. Corp.,
460 East Drive, Oak Ridge, Tenn.

David, Florence, Ph.D. (London) Univ. Sect. Stat. Dept. Univ. Coll., London, W.C. 1,
England.

De Garis, Prof. Charles, Ph.D. (Johns Hopkins) Univ. of Okla. School of Med., Okla. City,
4. Okla.

Echegaray, Miguel, C.E. Ag. Attache to the Spanish Embassy, 6700 16th St. N.W. Wash.,
D. C.

Ede, Richard, B.S. (Wisconsin) Chemistry Devel. Metallurgist, Gary Works, Car. Steel
Ill. 647 Fillmore St., Gary, Indiana.

Ewart, Robert, A.B. (New York Univ.) Research Physicist, Ballistics Dept. Des Moines 
Ord. Plant 68S-46th St. Des Moines 16, Iowa. 

Federer, Walter, M.S. (Kansas State) Research-Ag. Stat. Stat. Lab., Iowa State Coll. 
Ames, Iowa. 

Freeman, Richard, B.Sc. (McMaster) Research Chemist. 1 Maple Ave., Hamilton, Ontario, 
Canada. 

Goldrosen, David, B.S. (Worcester Poly Inst.) Lt. USNR Quality Control Officer, Insp. 
of Naval Matl. 604 Ward St. Newton Centre, Mass. 

Goodman, Albert, Supervisor Stat. Control, Quality Control, Westinghouse Elec. Corp,, 
Essington, Pa. 

Grant, Asst. Prof. David, Ph.D. (Stanford) Dept, of Psych., Univ. of Wis., Madison 6, 
Wisconsin. 

Greenhouse, Samuel, B.S. (City Coll. N. Y.) T/4 U.S. Army, 6816-lSth St. N.W. Wash., 
11, D. C. 

Gretton, Owen, A.B. (Brown) Acting Chief, Ind. Div. Sen. Econ., 10167 Old Bladensburg 
Road, Silver Spring, Maryland. 

Hayden, Byron, A.B. (Geo, Wash. Univ.) Econ. Stat. A. A. F. Wash. D. C. ISOl S. Cleve¬ 
land St., Arlington, Va. 

Hecht, Bernard, B.E.C. (City Coll. of N. Y.) T/sgt, 616 Corp., Army-Navy Electronics
Stand. Agency 46 Washington Village, Asbury Park, N. J.

Haufek, Lyman, M.B.A. (Northwestern) U. S. Army Hq. ASF, Chief Supply Stat. Unit,
1161 New Hampshire Ave., N.W., Wash. 7, D. C.

Kampschaefer, Margaret, A.B. (Indiana) Stat. Bur. of Labor Stat. 10S7 E. Blackford
Ave., Evansville, IS, Indiana.

Kozakiewicz, Waclaw, Ph.D. (Warsaw) Inst. in Math., Univ. of Saskatoon, Saskatoon,
Canada.

Laguardia, Prof. Rafael, Director of Math. & Stat. (Univ. of Uruguay) Fine Hall, Princeton
Univ., Princeton, N. J.

Leighton, Walter, Ph.D. (Harvard) On leave at Northwestern as Director, Applied Math.
Group (NORC) Lecturer in Math. The Rice Inst. 1704 Judson Ave., Evanston, Illinois.

Lieblein, Julius, M.A. (Brooklyn Coll.) Econ. Anal. Room 4013, U. S. Trea. Dept., 16th
& Penna. N.W. Wash. 66, D. C.

Lien, Roy, M.S. (Oregon State) Rate Stat., Northwestern Elect. Co., Portland, Oregon,
S161 S.E. Division St., Portland 6, Oregon.

Lonseth, Asst. Prof. Arvid, Ph.D. (California) Math. Dept. Northwestern University,
Evan., Ill.
Mioludttp, BiiCt Ph.D. (Univ. of Vienna) Math & Astronomy Actuary, Apartado S48,
Caracas, Venezuela.

Monro, Sutton B.S. (Mass. Inst. Tech.) Head of Str. Staff Unit. Amm. Div. Naval Ord.
Lab. Lt. USNR S4SS Martha Custis Dr. Alexandria, Va.

Nilson, Hugo, Ph.D. (Minnesota) Chemist in Charge Fishery Tech. Lab. U. S. Fish &
Wildlife Serv. College Park, Maryland.

Nichols, Russell, B.A. (DePauw) Sergeant, U. S. Army Co. A. UBS A /, Kn APO BSSy
NYC{SS^74S^).

O’Neil, Frank (Lowell Textile Inst.) Senior Textile Technician, Worsted Division, Pacific
Mills, Lawrence, Mass.

Rappaport, Gladys, B.A. (Hunter) Jr. Stat. Stat. Research Group, Columbia Univ.,
MO Tiebout Ave., Bronx $7, New York.

Rice, Assoc. Prof. Nelson, Ph.D. (C. U. of A.) SS»6 ISth St. A .A\, Wash., 17, D. C.

Schell, Emil, M.A. (Western Reserve) Stat. Employment Stat. Div. 3440 N. 13 Hd.
Arlington, Va.

Schneberger, Richard, (Cert. to teach in Tech High School Training for Industry State
Programs) %Edison Gen. Elec. Appl. Co., 5600 W. Taylor St., Chgo., Ill.

Simon, Geo., Ed. M. (Harvard) Capt., A. C. Avia. Psych. Psych. Section, Surgeon, Hq.
AFTRC, Ft. Worth 3, Texas.

Spaulding, Asa, M.A. (Michigan) Actuary & Asst. Sec. No. Carolina Mut. Life Ins. Co.
Durham, North Carolina.

Spoerl, Charles, B.A. (Harvard) Asst. Treas. %Aetna Life Ins. Co. Hartford, Conn.

Springer, Wm., C.E. (Columbia) Asst. Vice Pres. in charge of Research, Bristol-Myers Co.
Hillside 6, New Jersey.

Stock, J. Stevens, M.A. (American) Lt. USNR, Hd. Stat. Sect. Div. of Shore Est. &
Civilian Per. Navy Dept., 8508 Garfield St., Bethesda, Maryland.

Stott, Alex, A.B. (Harvard) Lt. Comdr. USNR, 3800 Devonshire Pl., N.W., Wash. 8, D. C.

Taylor, Thomas, Ph.D. (Yale) Research Engineer, U. S. Testing Co. 4^ Grover Lane, Caldwell,
N. J.
REPORT ON THE RUTGERS MEETING OF THE INSTITUTE

The Eighth Summer Meeting of the Institute of Mathematical Statistics
was held at the New Jersey College for Women, Rutgers University, New Brunswick,
New Jersey on Sunday, September 16, 1945, where the Summer Meeting
of the American Mathematical Society was also being held. The following
115 members of the Institute attended the meeting:

C. B. Allendoerfer, R. L. Anderson, T. W. Anderson, H. E. Arnold, I. L. Battin, Archie
Blake, C. I. Bliss, P. Boschan, A. H. Bowker, A. E. Brandt, G. W. Brown, R. H. Brown, T.
H. Brown, T. A. Budne, R. S. Burington, B. H. Camp, A. G. Carlton, P. C. Clifford, E. P.
Coleman, T. F. Cope, G. M. Cox, H. B. Curry, J. H. Curtiss, J. F. Daly, J. H. David^, B.
B. Day, W. E. Deming, H. F. Dodge, Jacques Dutka, P. S. Dwyer, Churchill Eisenhart,
Wade Ellis, Mary Elveback, Benjamin Epstein, C. D. Ferris, C. H. Fischer, M. M. Flood,
R. M. Foster, Milton Friedman, J. P. Gill, M. A. Girshick, Casper Goffman, A. A. Goodman,
Dorothy K. Gottfried, T. N. E. Greville, F. E. Grubbs, K. W. Halbert, Marshall Hall, P.
R. Halmos, Miriam S. Harold, Millard Hastay, Bernard Hecht, William Hodgkinson, I. S.
Hoffer, Harold Hotelling, A. S. Householder, W. Hurwicz, Irving Kaplansky, C. J. Kirchen,
Jack Laderman, Rafael Laguardia, H. G. Landau, Howard Levene, Harriet Levine, S. B.
Littauer, A. T. Lonseth, P. J. McCarthy, W. G. Madow, J. W. Mauchly, E. B. Mode, D. J.
Morrow, J. E. Morton, Judith Moss, P. M. Neurath, M. L. Norden, H. W. Norton, C. O.
Oakley, P. S. Olmstead, Edward Paulson, John Riordan, H. E. Robbins, H. G. Romig, William
Salkind, M. M. Sandomire, Arthur Sard, F. E. Satterthwaite, L. J. Savage, Henry Scheffé,
Bernice Scherl, Edward Schrock, I. E. Segal, C. E. Shannon, L. W. Shaw, Herbert Solomon,
Mortimer Spiegelman, J. R. Steen, Arthur Stein, F. F. Stephan, A. P. Stergion, L. V.
Toralballa, Mary N. Torrey, A. W. Tucker, L. R. Tucker, J. W. Tukey, Helen M. Walker,
W. A. Wallis, R. M. Walter, B. T. Weber, Joseph Weinstein, A. E. R. Westman, Frank Wilcoxon,
S. S. Wilks, Jacob Wolfowitz, C. P. Winsor, Ruth Zwerling.

The first session, on Sunday morning, was devoted to a symposium on Sequential
Analysis. Professor W. Allen Wallis, of Stanford University and Columbia
Statistical Research Group, acted as chairman for this session. The following
invited addresses were given.

1. Theory of Sequential Analysis.

Professor A. Wald, Columbia University and Columbia Statistical Research Group. 

2. Construction of Multiple Sampling Inspection Plans for Attributes from Sequential 
Principles. 

Mr. Milton Friedman, National Bureau of Economic Research and Columbia 
Statistical Research Group. 

3. Applications of Sequential Analysis to the Ranking of Two Populations with Respect to 
a Single Parameter. 

Mr. M. A. Girshick, Bureau of Agricultural Economics and Columbia Statistical Research
Group.

The morning session was concluded after lively discussion on the symposium
topic.

Dr. W. Edwards Deming, of the Bureau of the Budget and President of the
Institute, presided at the afternoon session. The following papers were presented:
1. On The Variance of a Random Set in n Dimensions.

Dr. Herbert E. Robbins, Post Graduate School, Annapolis.

2. The Non-Central Wishart Distribution and its Application to Problems In Multivariate
Statistics. 

Dr. T. W. Anderson, Jr., Princeton University. 

3. The Effect on a Distribution Function of Small Changes in the Population Function. 

Professor Burton H. Camp, Wesleyan University.

4. On Composite Distributions.

Dr. Casper Goffman and Dr. Benjamin Epstein, Westinghouse Electric Corp.

5. Population, Expected Values and Sample.

Professor Emil J. Gumbel, New School for Social Research.

6. On the Selection of a Sample in Repeated Steps.

Dr. W. G. Madow, Bureau of the Census.

7. On Optimum Estimates for Stratified Samples.

Mr. Morris H. Hansen and Mr. William N. Hurwitz, Bureau of the Census. Presented
by Margaret Gurney. 

8. Pearsonian Correlation Coefficients Associated With Least Squares Theory (Presented 
by Title). 

Professor P. S. Dwyer, University of Michigan. 

The afternoon session concluded with the report of the Committee on the 
Teaching of Statistics which was presented by Professor Harold Hotelling of 
Columbia University. 

P. S. Dwyer, 
Secretary 



ON THE NORMAL APPROXIMATION TO THE BINOMIAL 
DISTRIBUTION 

By W. Feller 
Cornell University

1. Although the problem of an efficient estimation of the error in the normal 
approximation to the binomial distribution is classical, the many papers which 
are still being written on the subject show that not all pertinent questions have 
found a satisfactory solution. Let for a fixed n and $0 < p < 1$, $q = 1 - p$,

(1) $T_k = \binom{n}{k} p^k q^{n-k},\qquad P_{\lambda,\nu} = \sum_{k=\lambda}^{\nu} T_k.$

For reasons of tradition (and, apparently, only for such reasons) one sets

(2) $z_k = (k - np)\sigma^{-1},\qquad \sigma = (npq)^{1/2}$

and compares (1) with

(3) $N_k = (2\pi)^{-1/2}\sigma^{-1}e^{-z_k^2/2}$ and $\Pi_{\lambda,\nu} = \Phi\left(z_\nu + \frac{1}{2\sigma}\right) - \Phi\left(z_\lambda - \frac{1}{2\sigma}\right)$

respectively,¹ where $\Phi(z)$ stands for the normalized error function. Many
estimates are available for the maximum of the difference $|P_{\lambda,\nu} - \Pi_{\lambda,\nu}|$ for all $\lambda$, $\nu$.
Now this error is $O(\sigma^{-1})$ and even a precise appraisal will break down in the two
most interesting cases: if $\sigma$ is small, or if $\lambda$ and $\nu$ are large as compared to $\sigma$.
Indeed, even for moderately large values of k (such as are usually considered)
the contribution of $T_k$ to the sum in (1) will be considerably smaller than $\sigma^{-1}$,
so that any estimate of the form $O(\sigma^{-1})$ leaves us without guidance. With some
modifications this remains true also for more refined estimates like Uspensky’s
remarkable result²

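(4) $P_{\lambda,\nu} = \Pi_{\lambda,\nu} + \dfrac{q-p}{6\sigma\sqrt{2\pi}}\left[(1-z^2)e^{-z^2/2}\right]_{z_\lambda - 1/(2\sigma)}^{z_\nu + 1/(2\sigma)} + \omega$

with

$|\omega| < (.13 + .18\,|p-q|)\,\sigma^{-2} + e^{-3\sigma/2}$

provided $\sigma \ge 5$. What is really needed in many applications is an estimate of
the relative error, but this seems difficult to obtain.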

It should also be noticed that the accuracy of the normal approximation to the
binomial is by no means quite as good as many texts would make appear. Examples
using p = 1/2 and intervals which are symmetric with respect to np are hardly
conclusive, since there the main error term drops out and systematic positive
and negative errors cancel. Again, in practice comparatively small $\sigma$ and
comparatively large $\nu$ are frequently used. It works well to compare a $P_{\lambda,\nu}$ of a
numerical value, say, .93 with a corresponding value $\Pi_{\lambda,\nu}$ of, say, .95. In
classroom discussions the error may seem insignificant. However, in most actual
applications one would consider the complementary probabilities, and the very
same figures mean an approximation .05 to the correct value .07. If a confidence
limit is set to the five per cent level, the normal approximation would in our
example mean that two out of seven critical cases are missed. Consider next the
example p = 1/10, n = 10,000. For values of k around 1120 the relative error of
$N_k$ is about .30; it increases rapidly with increasing k. Around k = 1150 the
relative error exceeds 2/3, around 1180 it is nearly 1.4. And yet this example
is conservative in comparison with many cases where the normal approximation
is used in practice.

¹ Very often the limits $z_\lambda$ and $z_\nu$ instead of $z_\lambda - \frac{1}{2\sigma}$ and $z_\nu + \frac{1}{2\sigma}$ are used. This naturally
results in an unnecessary systematic undervaluation.

² Uspensky [3], p. 129. A two-term development of $T_k$ with an error of 0(v”*) valid for
$|x| < 2$, $\sigma > 3$ has been given by Mirimanoff and Dovaz [1927].

It is surprising that the classical norming (2) is generally accepted although
there does not seem to exist any deeper reason for it. The use of moments,
though usually very convenient, does not necessarily lead to best results. For
example, the density function

(5) $f_n(x) = \dfrac{1}{n!}\,x^n e^{-x}$

is the (n + 1)-fold convolution of $f_0(x)$ with itself and therefore, for large n,
of nearly normal "type." The conventional norming would approximate
$f_n(x)$ by $[2\pi(n+1)]^{-1/2}\exp\{-(x-n-1)^2/(2(n+1))\}$, while the use of the norming factor n
instead of (n + 1) seems clearly indicated.

Actually, as will be seen, it is natural (at least for small values of k − np)
to replace (2) by

(6) $x_k = \{k + \tfrac{1}{2} - (n+1)p\}\sigma^{-1},$

and accordingly to approximate $P_{\lambda,\nu}$ by the error integral taken between the limits

(7) $\{\lambda - (n+1)p\}\sigma^{-1}$ and $\{\nu + 1 - (n+1)p\}\sigma^{-1}.$

For example, let p = 1/10, n = 500, $\lambda$ = 50, $\nu$ = 55. The correct value is $P_{50,55} = .317573$;
the norming (2) leads to $\Pi_{50,55} = .32357$, while the more natural limits
(6) lead to an approximation .31989. More important are the quite unexpected
simplifications which the norming (6) permits when one studies the error for
large $x_k$ or small $\sigma$.
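
The comparison of the two normings can be tried out directly. The Python sketch below computes the exact sum of binomial terms, the classical approximation (3), and the error integral between the limits (7). It assumes that $\sigma^2 = (n+1)pq$ is meant in (7), as in (24), and the function names are illustrative only. It is a sketch of the comparison and does not claim to reproduce the quoted figures to the last digit.

```python
import math


def phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def binom_term(n, k, p):
    """Exact binomial term T_k, computed through log-gamma for stability."""
    log_t = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
             + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_t)


def exact_prob(n, p, lam, nu):
    """Exact P_{lam,nu}: the sum of T_k for lam <= k <= nu."""
    return sum(binom_term(n, k, p) for k in range(lam, nu + 1))


def classical_approx(n, p, lam, nu):
    """Norming (2) with the half-unit correction, as in (3)."""
    sigma = math.sqrt(n * p * (1.0 - p))
    return phi((nu + 0.5 - n * p) / sigma) - phi((lam - 0.5 - n * p) / sigma)


def shifted_approx(n, p, lam, nu):
    """Limits (7); sigma^2 = (n + 1) p q is an assumption taken from (24)."""
    sigma = math.sqrt((n + 1) * p * (1.0 - p))
    centre = (n + 1) * p
    return phi((nu + 1 - centre) / sigma) - phi((lam - centre) / sigma)


if __name__ == "__main__":
    args = (500, 0.1, 50, 55)
    print(exact_prob(*args), classical_approx(*args), shifted_approx(*args))
```

In this example the limits (7) land closer to the exact value than the classical limits do, which is the point of the paragraph above.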

We are now led to reformulate the problem: instead of starting with arbitrary
limits for the error integral and estimating the resulting error, we shall try to determine
the limits so as to minimize the error. Theoretically, for any given $\lambda$, $\nu$ these limits
could be determined so as to give an exact value for $P_{\lambda,\nu}$. However, such limits
would depend in the most intricate way on $\lambda$ and $\nu$. For practical purposes one
would restrict the considerations to certain simple functions such as polynomials.
We shall here consider only the case where the limits are at most quadratic
polynomials. Essentially our problem seems that treated by S. Bernstein
(and, apparently, only by him). In a series of papers since 1924, S. Bernstein
has considered the accuracy of the normal approximation. Quite recently³
he has, by a considerable computational effort, extended the range of validity
from $npq \ge 365$ to $npq \ge 62.5$ and proved the following
Theorem (S. Bernstein): Let

(8) $npq \ge 62.5$

and let a*, he the solutions of the quadratic equations 

X - \ - np = axinpqf'^ + al 

(9) 

X + J - np = • 

If 

(10) a > 0, |3 < 2"‘'(npg)''‘ 

then 

(11) ^ Px.. < 4>(a.) ~ ^(ax). 

The conditions (10) are practically equivalent to 

(12) X > wp + I, v <np+ 2^'V'\ 

The remarkable feature of this excellent result is that the error remains 0(<r“*) 
throughout an interval which increases with $\sigma$ (instead of the conventional uniformly
bounded intervals).

In the sequel it will be shown that startling simplifications can be obtained if
the norming (6) is used from the beginning instead of (2). Our main result is an
improvement of S. Bernstein’s theorem. The condition (8) will be replaced by
$(n+1)pq \ge 9$. The first condition in (10) will be relaxed to $\lambda \ge (n+1)p$, that
is to say, our theorem will hold for all k exceeding the central value (for those less
than the central value an analogous theorem holds); in the other condition (10),
the numerical value $2^{1/2}$ will be replaced by an arbitrary constant. Instead of
quadratic equations, we shall consider quadratic polynomials. And finally, the
gap between the two sets of limits will be reduced.

It will be seen that the computations leading to this improvement are almost
negligible in comparison with S. Bernstein’s deeper method; with slightly more
sophisticated arguments and numerical evaluations, our results can be considerably
improved. Our consideration will be based on a new expression for
$T_k$, in which only exponential terms appear but the usual square root is missing.

³ S. Bernstein [1]; the first paper of the series appears to have appeared in Ucenye Zapiski,
Kiev, 1924.
In passing from approximations to $T_k$ to approximations to $P_{\lambda,\nu}$ one has to
replace sums by integrals. This procedure is cumbersome if an estimate of the
relative error is desired. Euler’s formula and other standard formulas are of
little use. We shall therefore start with a lemma which, it is hoped, may be
useful in this connection; it will therefore be proved in a slightly more general
form than actually required for the present paper.

2. Lemma^ 1. For 0 < h < i and | | < 1 


/.z+A/f 


880 - - 285 


Proof. Denote the integral in (13) by J. Then 






0hl2 

= / ckxte~‘*'^ di. 

Jo 


We begin by showing that for 0 < a < § 

(16) ga«/2-«*/ll < 

In fact 

(. 7 , > (, + n) ^ ‘ + 5 + T ? (0 ^ 

and 

> (x +1 + ^X‘ - h) ^ ‘ + T + S' ,-:V ^ 


It follows from (15) and (16) that 


> 2h~ 


^(x*-l)<*/3-4z*eV66 


^(x*~1)<»/8-x<M/66 ^ — 1 ^2 

e _ 


mnri 

> 2 hr^ I 

Jo 


which proves one part of the lemma. 

To obtain an upper estimate we make use of the inequalities 


^(x*-l)«*/8 




-*t/8-hp4|4/i8 


⁴ The fraction 1/2 is chosen quite arbitrarily; if h be restricted to 0 < h < 1 the first member
of (14) remains unchanged, while the fraction 1/285 on the right side has to be replaced by 1/264.
( 20 ) ^ (l + ^*^1 - ^

^ “3 

^xU*iis+h*(m 

Using (16) and (20), the proof of the second part of the lemma follows from a 
computation analogous to (19). 

For our purposes it is convenient to use Stirling’s formula in a form which is 
not quite the usual one. 

Lemma 2 (Stirling's formula). For $n \ge 4$,

(21) = ( 2 ir)^(n + 


i(.+ 


- 1 


or 

(22) n! = 
where 

(23) 1I < i, » 0 as n —^ 00 . 

Formula (21) can be derived from the gamma function or in any other way 
that leads to the standard form (22).® 
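
A numerical check of the half-shifted form of Stirling's formula may help here. The Python sketch below assumes the standard asymptotic expansion $\log n! = \tfrac{1}{2}\log 2\pi + (n+\tfrac12)\log(n+\tfrac12) - (n+\tfrac12) - \frac{1}{24(n+\frac12)} + \dots$, whose next term is $\frac{7}{2880(n+\frac12)^3}$; the exact remainder stated in (21) is not reproduced, and the function names are illustrative only.

```python
import math


def log_factorial_half_shift(n):
    """Leading terms of log n! expanded in powers of 1/(n + 1/2)."""
    m = n + 0.5
    return 0.5 * math.log(2.0 * math.pi) + m * math.log(m) - m - 1.0 / (24.0 * m)


def remainder(n):
    """Exact log n! (via lgamma) minus the truncated expansion."""
    return math.lgamma(n + 1) - log_factorial_half_shift(n)


if __name__ == "__main__":
    for n in (4, 10, 30):
        bound = 7.0 / (2880.0 * (n + 0.5) ** 3)
        print(n, remainder(n), bound)
```

For the values of n tried, the remainder is positive and stays below $7/(2880(n+\tfrac12)^3)$, which is consistent with the constant 7/2880 that appears later in (27).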

3. From now on we shall put

(24) $\sigma^2 = (n+1)pq$

(25) $x_k = \{k + \tfrac{1}{2} - (n+1)p\}\sigma^{-1};$

the subscript k will be omitted whenever no confusion is to be feared. To transform
$T_k$ we shall use (21) for the factorials in the denominator, but (22) for
(n + 1)! in the numerator.


® A simple proof runs as follows. Put Bn -* n!(n 4- Then 

1 _ V _i-- \ -i— - ^ 1 -b 

2^i2y + 1) J (2^)*- “ 60 (2p)^ 

with 0 < 5i < ;^ if p ^ 5. From here (21) follows using the fact that 
V log - log - i log (2t) 

p-n+1 

and that for n 4 


1 - 6 * 1 1 
3(n + 4)* ^ p‘ 3(n + 4)« 


with 0 < 2 < In this Way the estimate (23) can be considerably improved. 
25 
Then


log {{2rf <rT,) = (n + 1) log (n + 1) - (fc + J) log —ti 

P 


(26) 


(27) 


(n - k +i) log - -i + 


1 


+ 

+ 


]2(n + 1) 24(A: + i) 

1 _ _ 
24(n - Jfc +T) ~ '* 




+ PS. + 

]2<t“ ^ 


24<r» (l + 24.^ (l - " 

0 ^ ^ 7 / 1 , 7 r 1 1 1 

- ^ - 6 \360(n + 1)’ 2880 [(^ + iY (n - k + §)»J 


1 I 




provided only that /c > 4, (n — /p) >4. Asymptotically p is equivalent to the 
right-hand member ^\^thout factor J (which, by the way, could be replaced by 
1 + A)* Obviously 


if fc > 4, n — /c > 4. We shall consider later on the case > 3, I -r I < Ui 
then clearly A: > 4, n — /c > 4, so that the use of (28) will be justified. Expand¬ 
ing (26) into a power series we obtain 
Theorem. // A; > 4, n — > 4, 


n = (27r)'^cr^ exp^- 


(29) 


24(7^ ? 


( - qr' iL 

^.--2 

') +1+2?? 


Kv - 1) 

(- q)' 


24<7“ 


— P 


where $\rho$ satisfies (28) (and (27)); x and $\sigma$ are defined by (25) and (24), respectively.
Each term of the second series will usually be small as compared to the corresponding
term of the first series; the second series can therefore, if desired, be
absorbed in the error term. If x is small the first term of the first series will be
preponderant. However, as x increases, more and more terms ^^ill make them¬ 
selves noticeable; if a; three terms mil be essential, and so on. 

Formula (29) permits us to approximate $P_{\lambda,\nu}$ by means of integrals. The
tangent rule would suggest to compare to 


( 30 ) 






and (29) together with lemma 1 permits easily to estimate the relative error
in the practically most important cases. It is also seen that the limits in (30)
are essentially the only limits depending linearly on $\lambda$ and $\nu$ which will render the
relative error for x = 0(1). Instead of elaborating on these simple 

questions we proceed to the more intricate problem of limits which are quadratic 
polynomials in X and v. 

4 . For brevity we shall from now on put 


(31) 



The estimate | a | ^ 4 will be used constantly. It obviously suffices to consider 
values of X < which exceed the central value [(n + l)pl. 

Theorem. Suppose that 

(32) cr > 3 

and 


(33) X > (?i -f- l)p J' + 2 ^ (^ + l)p + 
Then 

(34) Px,. < e ), 


if 

(35) 




k 


{n 4- l)p ^ a j k — (n + l)p ' 

<y <r \ O’ 


2a __ 1 
<r 


while the inequality in (34) is reversed if 


(36) 17 . 
where 

(37) 


k - 


(n 4- l)p ^ g / A- - (n + l)p 
(T (T \ cr 


+ 


^ M 2. 

6(r la 


fr =: + I - (^ + 

<r <r* 


The gap between the limits (35) and (36) is 0(cr"'^) if xj = 0(<r). In S. Bern¬ 
stein ^s case (12), M < y/2 and the gap is about 2/(5<r). It will be seen from 
the proof that it requires only routine computations to improve the correction 

term in (36). 

Proof. Put 


(38) 


y I U 2 

= T. H— Xk^ 
a 


again suppressing the subscripts wherever convenient. As a consequence of 
(33), we shall be concerned only with values Xk satisfying 


1 ^ .2 


( 39 ) 
Consider first the main series in (26) and write


r K- -1) 


’-if + -A, 


where 


4 + ~ (-qT 

\ 12 2jcr^^^ 


We shall require some estimates of A. First consider the case a > 0. Then all 
terms of the series are positive, while the expression within parentheses assumes 
its minimum for p = By (39) t < V whence 

(42) ^ 

If o < 0 the signs in the series (41) alternate, each negative term being smaller 
in absolute value than the preceding positive term. Therefore, using (39), 


r^3 1 4 4 

( 43 ) 

The expression within braces is a cubic in p which assumes its minimum for p == 
(1 + \/793)/72 = ,405.... It follows that 

<«> 

(half of this estimate would actually suffice for oui’ purposes). On the other 
hand, it is evident from (41) that the ratio A/x* attains its maximum for p = 1. 
Therefore, using (39) 

<«> 


Next we write 




Z {p'"^ - (-gr‘) 


( sr -^ 


i + B, 


whence 


( 47 ) + 

A trivial computation analogous to (43) shows that B > 0. Again, if a < 0, 
the signs in the series (47) alternate and in this case 

(48) 0<B< 
If a > 0 we can majorize (47) by a geometric series and obtain




Now put 


» <r 




(51) $\xi_k + \tfrac{1}{2}\Delta\xi_k = \xi_{k+1} - \tfrac{1}{2}\Delta\xi_{k+1}$

so that the intervals with endpoints $\xi_k \pm \tfrac{1}{2}\Delta\xi_k$ are non-overlapping and contiguous.
Clearly


= <r” 




Introducing (40), (46), and (52) into (29) we obtain 


Tjt = (2 t)-^'^ AJ.exp^ 

I <T 4<r 




To appraise the logarithmic term we write 




C f ^ attains its maximum value when a = — J, and it is readily seen that 

0 < C < if o > 0 

<r 

0 < C < it „ < 0. 

(7* 

Finally we put, with a parameter u to be determined, 


2 / * $ + 


2a — w 


Aj/ =* A{. 


If one puts 


2cr 4(j* 


and rtk is defined by (36), then 

(58) jfk + i^Vk = yk — iA^fc = rik . 
On the other hand, if

(59) 


ilf _ 1 _ a 
6 7 4 ? 


and $\eta_k$ is defined by (36), the identities (58) hold again. Accordingly, all we have
to show is that, with u defined by (57),

(60) $T_k < \Phi(y_k + \tfrac{1}{2}\Delta y_k) - \Phi(y_k - \tfrac{1}{2}\Delta y_k)$

and that the inequality in (60) is reversed if u is defined by (59).

Elementary transformations lead from (53) to 

(61) T, = ( 2 ,)-‘ Ar exp |-|' + (y' - 1 ) + — 3 ^ + , 

where 


(62) E 


u — 4an 




— A -j- -j- (7 — 


p- 


Let now u be defined by (57). In view of lemma 1 and (61), the inequality 
(60) will be proved if we show that 


(63) 

Now clearly 

(64) 


E, 


E + «^^'<0. 


(i 4- ^ y*(.^y)' 

W \ a J ' 24 - 880 ■ 


Moreover, introducing the estimates (28), (32), (42), (44), (48), (49), and (55) 
into (62) it is seen that for o > 0 


( 66 ) 

and for o < 0 

( 66 ) 




,>6< + 


The derivatives of the right-hand members in (65) and ( 66 ) are both negative 
for f > 0. Now we are interested only in values x Satisfying (39). For such 
107 _ 107 


values f > 


216(r’ 


For i 


216<t 


the right-hand members in (65) and ( 66 ) are 


negative, so that Ei < 0 for x > — . This proves the first part of our theorem. 

The proof that with (59) the inequality in (60) is reversed proceeds on similar 
lines. We have to show that 


E 2 = E 


(Ay)* 

285 ^ 


> 0 . 


(67) 
Suppose that $a \le 0$, which is the less favorable case. Then, by (45), (37), and
(39),


( 68 ) 

Similarly 
(69) . 


15 O' 20 O’ 




Using (62) we have therefore, neglecting the non-negative terms B and C, 


(70) 




2<r“ 2ic* 


u 

3a» 


1 

250a^ 



_ .W 

72<r» 20(r 12a*J 


1 

24 < 7 » 



The expression at the right side represents a parabola, and it suffices to show that 
it assumes positive values at the endpoints of our interval (39). Now 


(71) 



and simple arithmetic shows that, with (69) the expression within the braces
more than counterbalances the negative terms outside.⁶ If a > 0 the situation
is more favorable and the estimate (59) can then be further improved.
⁶ A more careful computation shows that it suffices if we put u « 7^ instead
THE VARIANCE OF THE MEASURE OF A TWO-DIMENSIONAL RANDOM SET

By J. Bronowski and J. Neyman
Princes Risborough, England and the University of California

1. Introduction. In a recent paper H. E. Robbins¹ has solved the problem
of the variance of the measure of a one-dimensional random set. The present
paper treats a similar problem relating to a two-dimensional random set under
somewhat more general conditions.

Let R denote a rectangle of dimensions a × b whose position is fixed. Let R'
denote another fixed rectangle concentric with R, its sides a + γ and b + γ (where
γ > 0) being parallel to the sides a and b respectively of R. Finally, let ρ denote
a rectangle of fixed dimensions but variable position, whose sides α < 2γ and
β < 2γ are parallel to a and b respectively, but the position of whose center will
be considered as random. In fact it will be assumed that the rectangle ρ is
dropped on the plane of R in a manner which satisfies the following two
assumptions:

(i) The probability that the center of ρ falls within R' exactly s times has a
defined value P_s for each s = 0, 1, 2, ···. Thus, if Ψ(u) denotes the probability
generating function of s, so that

(1)    $\Psi(u) = \sum_{s=0}^{\infty} u^{s} P_{s},$

then Ψ(u) is assumed known but will be left arbitrary till the general result is
obtained.

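(ii) Whenever a fixed number s of centers of ρ fall within R', it will be assumed
that the probability that exactly k centers of ρ fall within any chosen sub-area w
contained in R' is given by the binomial expression

(2)    $\frac{s!}{k!\,(s-k)!}\left(\frac{w}{R'}\right)^{k}\left(1-\frac{w}{R'}\right)^{s-k},$

where w and R' stand also for the areas of the regions they denote.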


Under the above conditions, denote by E the set of all those points of R which
are covered at least once by the rectangle ρ during the course of the trials
considered. Let X denote the measure of E. The purpose of this paper is to
evaluate the first two moments of X.

First, the computations will be made for the case when s is fixed, i.e. when

(3)    $\Psi(u) = u^{s}.$

The values of the two moments of X computed for fixed s will be denoted by
M_1(a, b | s) and M_2(a, b | s). Next, the moments of X will be evaluated for an
arbitrary generating function Ψ(u), and these will be denoted by M_1(a, b) and
M_2(a, b).

¹ H. E. Robbins, "On the measure of a random set", Annals of Math. Stat., Vol. 15
(1944), pp. 70-74.
H. E. Robbins has found the first moment

(4)    $M_1(a, b \mid s) = ab\left\{1 - \left(1 - \frac{\alpha\beta}{R'}\right)^{s}\right\}.$

Also, for a one-dimensional set, he has obtained the second moment, say M_2(a | s),
when α < a.

It follows immediately from (4) and (1) that, whatever be the probability
generating function Ψ(u),

(5)    $M_1(a, b) = ab\left\{1 - \Psi\!\left(1 - \frac{\alpha\beta}{R'}\right)\right\}.$

In particular, if the probabilities P_s are those of Poisson when the density of
positions of the center of ρ per unit of area is λ, so that

(6)    $\Psi(u) = e^{-\lambda R'(1-u)},$

then

(7)    $M_1(a, b) = ab\left\{1 - e^{-\lambda\alpha\beta}\right\}.$

Our remaining problem, therefore, is that of evaluating the second moment of
X. Instead we shall evaluate the second moment of

(8)    $Y = ab - X,$

and shall denote it by m(a, b | s) or m(a, b) according as s is or is not considered
to be fixed.
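
The first-moment formulas (4), (5) and (7) are simple enough to evaluate directly.
The following is a minimal sketch added for illustration, not part of the original
paper; the function names and the argument `area_outer` (standing for the area of
R') are assumptions made here.

```python
import math


def mean_covered_area_fixed(a, b, alpha, beta, area_outer, s):
    """Formula (4): expected covered area of R when exactly s centers fall in R'."""
    return a * b * (1.0 - (1.0 - alpha * beta / area_outer) ** s)


def mean_covered_area_pgf(a, b, alpha, beta, area_outer, pgf):
    """Formula (5): the same expectation for an arbitrary generating function pgf(u)."""
    return a * b * (1.0 - pgf(1.0 - alpha * beta / area_outer))


def mean_covered_area_poisson(a, b, alpha, beta, lam):
    """Formula (7): Poisson case with density lam of centers per unit area."""
    return a * b * (1.0 - math.exp(-lam * alpha * beta))
```

Passing `lambda u: math.exp(-lam * area_outer * (1.0 - u))` as `pgf` reproduces the
Poisson value, which is a quick check that (7) follows from (5) and (6).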


2. Derivative of the second moment of Y. In order to evaluate m(a, b), we
begin by calculating its second (mixed) derivative, say D(a, b | s), where

(9)    $D(a, b \mid s) = \frac{\partial^{2} m(a, b \mid s)}{\partial a\,\partial b}$
       $= \lim_{\Delta a, \Delta b \to 0} \frac{1}{\Delta a\,\Delta b}\{m(a + \Delta a, b + \Delta b \mid s) - m(a, b + \Delta b \mid s) - m(a + \Delta a, b \mid s) + m(a, b \mid s)\}$
       $= \lim_{\Delta a, \Delta b \to 0} \frac{I(\Delta a, \Delta b)}{\Delta a\,\Delta b}$ (say),

where Δa and Δb are the increments of a and b respectively. Once D(a, b | s)
is found, the formula for m(a, b | s) will be obtained by two quadratures. For
definiteness we shall assume Δa and Δb both to be positive, but of course the
argument which follows applies equally to other cases.

Consider the rectangle of dimensions (a + Δa) and (b + Δb) as shown in Figure
1, and denote by U, V and W the measures of the "uncovered" parts of the three
rectangles Δa × b, a × Δb, and Δa × Δb respectively. That is to say, U, V
and W are defined with respect to these three rectangles precisely in the same
manner in which Y is defined with respect to the original rectangle a × b = R.
Using the letter E to denote the expectation, we easily find that

(10)    $I(\Delta a, \Delta b) = 2E(YW) + 2E(UV)$
        $\qquad\qquad\qquad + 2E(VW) + 2E(UW) + E(W^{2}).$
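
The expansion behind formula (10) can be written out in one step. The following
worked line is a sketch added for the reader; it uses only m = E(Y²) and the fact
that the uncovered measures of the rectangles (a + Δa) × (b + Δb), (a + Δa) × b
and a × (b + Δb) are Y + U + V + W, Y + U and Y + V respectively.

```latex
\begin{aligned}
&E\{(Y+U+V+W)^{2}\} - E\{(Y+V)^{2}\} - E\{(Y+U)^{2}\} + E\{Y^{2}\} \\
&\qquad = E\{W^{2} + 2YW + 2UV + 2UW + 2VW\}.
\end{aligned}
```

The squares Y², U², V² and the products YU, YV cancel, which leaves exactly the
five terms of (10).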

Figure 1.

However, each of the three expectations in the second line of formula (10) is
infinitesimal of an order higher than the product ΔaΔb. In fact, none of the
variables U, V and W can exceed the area of the rectangle of which it forms part;
that is,

(11)    $0 \le U \le b\,\Delta a, \qquad 0 \le V \le a\,\Delta b, \qquad 0 \le W \le \Delta a\,\Delta b.$

It follows that

(12)    $0 \le E(UW) \le b\,(\Delta a)^{2}\,\Delta b, \qquad 0 \le E(VW) \le a\,\Delta a\,(\Delta b)^{2}, \qquad 0 \le E(W^{2}) \le (\Delta a\,\Delta b)^{2}.$

Hence, from (9), (10) and (12),

(13)    $D(a, b \mid s) = 2 \lim_{\Delta a, \Delta b \to 0} \frac{1}{\Delta a\,\Delta b}\,[E(YW) + E(UV)].$

We now reduce the calculation of (13) to finite form by approximating to the
infinite sets Y, U, V, W by progressively more ample but finite sets. To do so,
we cover R' by progressively more ample but finite networks of points. More
precisely: consider a rectangular system of axes Oξ and Oη oriented as in Figure 1
so that the axes are common boundaries of a × b = R and of the rectangles
obtained by increasing a and b. Let

(14)    $d_n = a/(n + 1), \qquad \delta_n = b/(n + 1).$

Consider the lattice of points (ij) with coordinates

(15)    $\xi_i = i\,d_n, \qquad \eta_j = j\,\delta_n,$

for i = −ν_1^{(n)}, −ν_1^{(n)} + 1, ···, 0, 1, 2, ···, n; j = −ν_2^{(n)}, −ν_2^{(n)} + 1, ···,
0, 1, 2, ···, n, where ν_1^{(n)} and ν_2^{(n)} are the greatest integers such that

(16)    $\nu_1^{(n)} d_n < \Delta a$

and

(17)    $\nu_2^{(n)} \delta_n < \Delta b.$

To simplify the writing, the superscripts (n) will henceforth be dropped.

With every point (ij) we associate a random variable x_ij defined as follows.
If in the course of the trials contemplated none of the rectangles ρ covers (ij),
then x_ij = 1. Otherwise x_ij = 0. Further, write

(18)    $Y_n = d_n\delta_n \sum_{i=0}^{n}\sum_{j=0}^{n} x_{ij}, \qquad U_n = d_n\delta_n \sum_{i=-\nu_1}^{0}\sum_{j=0}^{n} x_{ij},$
        $V_n = d_n\delta_n \sum_{i=0}^{n}\sum_{j=-\nu_2}^{0} x_{ij}, \qquad W_n = d_n\delta_n \sum_{i=-\nu_1}^{0}\sum_{j=-\nu_2}^{0} x_{ij}.$

Now the boundary of the set E, for a fixed s, consists of one or more polygons
having a finite total number of sides each of bounded length. It follows that,
given any ε > 0, there exists, for a fixed s, a number N_ε(s) such that n > N_ε(s)
implies that

(19)    $|Y_n - Y| < \varepsilon,$

with similar inequalities relating to U_n, V_n and W_n. Hence it follows
immediately that

(20)    $\lim_{n\to\infty} E(Y_n W_n \mid s) = E(YW \mid s), \qquad \lim_{n\to\infty} E(U_n V_n \mid s) = E(UV \mid s).$
The expectations in formula (13) will therefore be obtained as limits of those
on the left hand sides of (20). We have

(21)    $E(Y_n W_n \mid s) = d_n^{2}\delta_n^{2} \sum_{i=-\nu_1}^{0}\sum_{j=-\nu_2}^{0} E\Bigl(x_{ij} \sum_{k=0}^{n}\sum_{l=0}^{n} x_{kl} \Bigm| s\Bigr),$

(22)    $E(U_n V_n \mid s) = d_n^{2}\delta_n^{2} \sum_{i=-\nu_1}^{0}\sum_{j=0}^{n} E\Bigl(x_{ij} \sum_{k=0}^{n}\sum_{l=-\nu_2}^{0} x_{kl} \Bigm| s\Bigr).$

Hitherto we have made no assumptions concerning the values of Δa and Δb.
Since these are to tend to zero, we may assume that

(23)    $0 < \Delta a < \gamma - \alpha/2, \qquad 0 < \Delta b < \gamma - \beta/2.$

On this assumption, we shall now compute the expectations of the type
E(x_ij x_kl | s), of which (21) and (22) are linear combinations.

Since the variables x_ij and x_kl are capable only of the two values unity and
zero, the expectation of their product is simply the probability that both of them
are equal to unity, i.e. the probability that both points (ij) and (kl) are "missed"
by all the s rectangles ρ falling on R'. This probability may have one of two
forms. If both

(24)    $d_n|i - k| < \alpha$ and $\delta_n|j - l| < \beta,$

then

(25)    $E(x_{ij}x_{kl} \mid s) = \left\{1 - \frac{2\alpha\beta - (\alpha - d_n|i-k|)(\beta - \delta_n|j-l|)}{R'}\right\}^{s},$

while otherwise

(26)    $E(x_{ij}x_{kl} \mid s) = \left(1 - \frac{2\alpha\beta}{R'}\right)^{s};$

in each case, in virtue of the assumption (ii) of Section 1.
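
The two forms of this probability can be read as "one minus the area in which a
center would cover at least one of the two points, divided by R', raised to the
power s". A minimal sketch under that reading follows; the function name and the
argument names (`dx`, `dy` for the coordinate differences, `area_outer` for the area
of R') are illustrative assumptions, not notation from the paper.

```python
def prob_both_missed(dx, dy, alpha, beta, area_outer, s):
    """Probability that two lattice points are both missed by all s rectangles.

    dx, dy are the coordinate differences d_n|i-k| and delta_n|j-l|.
    """
    dx, dy = abs(dx), abs(dy)
    # The two "forbidden" regions for a center are alpha-by-beta rectangles
    # around each point; they overlap only when both differences are small.
    if dx < alpha and dy < beta:
        overlap = (alpha - dx) * (beta - dy)
    else:
        overlap = 0.0
    forbidden_area = 2.0 * alpha * beta - overlap
    return (1.0 - forbidden_area / area_outer) ** s
```

When the two points coincide the forbidden area reduces to αβ, so the value agrees
with the single-point probability (1 − αβ/R')^s that underlies the first moment.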

The essential content of equations (24) to (26) is that, once the other variables
appearing in them are assigned, E(x_ij x_kl | s) is a function only of the differences
i − k and j − l. It is this fact which allows us to evaluate the limits of the
quantities in (21) and (22) in a simple manner, in effect by holding one of the
two freely variable points (ij), (kl) in a fixed position, say at the origin. Thus,
let

(27)    $E(\theta_n \mid s) = d_n^{2}\delta_n^{2} \sum_{i=-\nu_1}^{0}\sum_{j=-\nu_2}^{0} E\Bigl(x_{ij} \sum_{k=i}^{n+i}\sum_{l=j}^{n+j} x_{kl} \Bigm| s\Bigr).$

Owing to the remark just made, the expectation

(28)    $E\Bigl(x_{ij} \sum_{k=i}^{n+i}\sum_{l=j}^{n+j} x_{kl} \Bigm| s\Bigr) = E\Bigl(x_{00} \sum_{k=0}^{n}\sum_{l=0}^{n} x_{kl} \Bigm| s\Bigr),$
and it follows that

(29)    $E(\theta_n \mid s) = (\nu_1 + 1)(\nu_2 + 1)\, d_n^{2}\delta_n^{2} \sum_{k=0}^{n}\sum_{l=0}^{n} E(x_{00}x_{kl} \mid s)$
        $= [(\nu_1 + 1)(\nu_2 + 1)\, d_n\delta_n]\Bigl[d_n\delta_n \sum_{k=0}^{n}\sum_{l=0}^{n} E(x_{00}x_{kl} \mid s)\Bigr].$

Of the two factors in the square brackets in (29), the first tends to ΔaΔb as n
tends to infinity, and the second tends to the integral

(30)    $\int_0^{a}\!\!\int_0^{b} f(\xi, \eta)\, d\xi\, d\eta,$

where

(31)    $f(\xi, \eta) = \left\{1 - \frac{2\alpha\beta - (\alpha - \xi)(\beta - \eta)}{R'}\right\}^{s}$

if both 0 ≤ ξ ≤ α and 0 ≤ η ≤ β, and

(32)    $f(\xi, \eta) = \left(1 - \frac{2\alpha\beta}{R'}\right)^{s}$

otherwise. Thus the computation of the limit of E(θ_n | s) is straightforward.
It remains to show that it differs from that of E(Y_n W_n | s) in equation (21) by
an infinitesimal which is of an order higher than the product ΔaΔb.

Since the variables x_ij are capable only of the two values unity and zero, the
absolute value of the difference between the brackets in (21) and (27), that is,
between

(33)    $x_{ij} \sum_{k=0}^{n}\sum_{l=0}^{n} x_{kl}$ and $x_{ij} \sum_{k=i}^{n+i}\sum_{l=j}^{n+j} x_{kl},$

cannot be greater than −n(i + j) ≤ n(ν_1 + ν_2). It follows that

(34)    $|E(Y_n W_n \mid s) - E(\theta_n \mid s)| \le [d_n\delta_n(\nu_1 + 1)(\nu_2 + 1)]\,[d_n\delta_n\, n(\nu_1 + \nu_2)].$

As n tends to infinity, the right hand side of (34) tends to the product

(35)    $\Delta a\,\Delta b\,[b\,\Delta a + a\,\Delta b];$

whence

(36)    $\lim_{\Delta a, \Delta b \to 0} \frac{1}{\Delta a\,\Delta b} \lim_{n\to\infty} E(\theta_n \mid s) = \lim_{\Delta a, \Delta b \to 0} \frac{1}{\Delta a\,\Delta b}\, E(YW \mid s) = \int_0^{a}\!\!\int_0^{b} f(\xi, \eta)\, d\xi\, d\eta.$

A very similar procedure will serve to evaluate the limit of E(UV | s)/ΔaΔb.
Here, we replace the two freely variable points (ij), (kl) by two semi-fixed points,
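one being restricted to the axis Oξ and the other to the axis Oη. More precisely,
instead of considering E(U_n V_n | s) in equation (22) we consider, say,

(37)    $E(\varphi_n \mid s) = d_n^{2}\delta_n^{2} \sum_{i=-\nu_1}^{0}\sum_{j=0}^{n} E\Bigl(x_{ij} \sum_{k=i}^{n+i}\sum_{l=-\nu_2}^{0} x_{kl} \Bigm| s\Bigr),$

and it is easy to see that

(38)    $\lim_{n\to\infty} |E(U_n V_n \mid s) - E(\varphi_n \mid s)| \le b\,(\Delta a)^{2}(\Delta b),$

so that the quantity (37) may be used in equations (13) and (20) in place of the
quantity (22). However, since E(x_ij x_kl | s) depends only on the differences
i − k and j − l,

(39)    $E\Bigl(x_{ij} \sum_{k=i}^{n+i}\sum_{l=-\nu_2}^{0} x_{kl} \Bigm| s\Bigr) = E\Bigl(x_{0j} \sum_{k=0}^{n}\sum_{l=-\nu_2}^{0} x_{kl} \Bigm| s\Bigr),$

and therefore

(40)    $E(\varphi_n \mid s) = \{d_n(\nu_1 + 1)\}\, d_n\delta_n^{2} \sum_{j=0}^{n} E\Bigl(x_{0j} \sum_{k=0}^{n}\sum_{l=-\nu_2}^{0} x_{kl} \Bigm| s\Bigr).$

Further, and in the same way, we may replace the sum in (40), namely

(41)    $\sum_{j=0}^{n} E\Bigl(x_{0j} \sum_{k=0}^{n}\sum_{l=-\nu_2}^{0} x_{kl} \Bigm| s\Bigr) = \sum_{k=0}^{n}\sum_{l=-\nu_2}^{0} E\Bigl(x_{kl} \sum_{j=0}^{n} x_{0j} \Bigm| s\Bigr),$

by the simpler sum

(42)    $\sum_{k=0}^{n}\sum_{l=-\nu_2}^{0} E\Bigl(x_{kl} \sum_{j=l}^{n+l} x_{0j} \Bigm| s\Bigr) = (\nu_2 + 1) \sum_{k=0}^{n} E\Bigl(x_{k0} \sum_{j=0}^{n} x_{0j} \Bigm| s\Bigr)$
        $= (\nu_2 + 1) \sum_{k=0}^{n}\sum_{j=0}^{n} E(x_{k0}x_{0j} \mid s).$

It follows that we may replace the limit of E(U_n V_n | s) as expressed in (22) by

(43)    $\lim_{n\to\infty} \{d_n(\nu_1 + 1)\,\delta_n(\nu_2 + 1)\}\Bigl\{d_n\delta_n \sum_{k=0}^{n}\sum_{j=0}^{n} E(x_{k0}x_{0j} \mid s)\Bigr\},$

and this is easily found to be equal to

(44)    $\Delta a\,\Delta b \int_0^{a}\!\!\int_0^{b} f(\xi, \eta)\, d\xi\, d\eta,$

where f(ξ, η) is defined by the formulae (31) and (32).

Collecting this result with that expressed by (36), and substituting in equation
(13), we therefore have finally

(45)    $D(a, b \mid s) = 4 \int_0^{a}\!\!\int_0^{b} f(\xi, \eta)\, d\xi\, d\eta.$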
3. The forms of the derivative. Since the function f(ξ, η) has two different
forms (31) or (32) depending on the relationships between a, b, α and β, it will
be necessary to distinguish four different forms of the derivative (45), and of its
integral.

First, for values of a and b for which simultaneously

(46)    $a < \alpha$ and $b < \beta,$

the integrand in (45) has the form (31) for the whole region of integration.
Hence the value of D(a, b | s) in the region (46) is given by, say,

(47)    $D_1 = 4 \int_0^{a}\!\!\int_0^{b} \left\{1 - \frac{2\alpha\beta - (\alpha - \xi)(\beta - \eta)}{R'}\right\}^{s} d\xi\, d\eta = 4 \int_{\alpha-a}^{\alpha}\!\!\int_{\beta-b}^{\beta} g^{s}(t, \tau)\, dt\, d\tau,$

where

(48)    $g(t, \tau) = 1 - \frac{2\alpha\beta - t\tau}{R'}.$

Next, when a ≥ α but b < β, the integrand in (45) has the form determined
by (31) only when

(49)    $0 \le \xi \le \alpha, \qquad 0 \le \eta \le b,$

whereas when

(50)    $\alpha < \xi \le a, \qquad 0 \le \eta \le b,$

the appropriate form is that determined by (32). Therefore here D(a, b | s)
has the form, say,

(51)    $D_2 = 4b(a - \alpha)\left(1 - \frac{2\alpha\beta}{R'}\right)^{s} + 4 \int_0^{\alpha}\!\!\int_{\beta-b}^{\beta} g^{s}(t, \tau)\, dt\, d\tau.$

Similarly, for

(52)    $a < \alpha$ but $b \ge \beta,$

D(a, b | s) is given by, say,

(53)    $D_3 = 4a(b - \beta)\left(1 - \frac{2\alpha\beta}{R'}\right)^{s} + 4 \int_{\alpha-a}^{\alpha}\!\!\int_0^{\beta} g^{s}(t, \tau)\, dt\, d\tau.$

Finally, in the region in which simultaneously

(54)    $a \ge \alpha$ and $b \ge \beta,$

D(a, b | s) has the form, say,

(55)    $D_4 = 4(ab - \alpha\beta)\left(1 - \frac{2\alpha\beta}{R'}\right)^{s} + 4 \int_0^{\alpha}\!\!\int_0^{\beta} g^{s}(t, \tau)\, dt\, d\tau.$
4. The second moment of Y. We have now to determine m(a, b | s) for all
non-negative values of a and b, from the equation

(56)    $\frac{\partial^{2} m(a, b \mid s)}{\partial a\,\partial b} = D(a, b \mid s).$

The general solution of this equation is

(57)    $m(a, b \mid s) = \iint D(a, b \mid s)\, da\, db + A(a) + B(b),$

where A(a) and B(b) are each functions of one variable. These functions are
determined by the boundary conditions, namely

(58)    $m(a, 0 \mid s) = m(0, b \mid s) = \frac{\partial m(a, 0 \mid s)}{\partial a} = \frac{\partial m(0, b \mid s)}{\partial b} = 0,$

which are a consequence of the inequality 0 ≤ Y ≤ ab. It is then easily found
that the only solution m(a, b | s) satisfying (57) and (58) has the following four
different forms, depending on the values of a and b.

If a < α and b < β, then

(59)    $m(a, b \mid s) = \int_0^{a}\!\!\int_0^{b} D_1(x, y)\, dx\, dy = m_1(a, b \mid s)$ (say).

If a ≥ α and b < β, then

(60)    $m(a, b \mid s) = m_1(\alpha, b \mid s) + \int_{\alpha}^{a}\!\!\int_0^{b} D_2(x, y)\, dx\, dy = m_2(a, b \mid s)$ (say).

If a < α and b ≥ β, then

(61)    $m(a, b \mid s) = m_1(a, \beta \mid s) + \int_0^{a}\!\!\int_{\beta}^{b} D_3(x, y)\, dx\, dy = m_3(a, b \mid s)$ (say).

Finally, if a ≥ α and b ≥ β, then

(62)    $m(a, b \mid s) = m_1(\alpha, \beta \mid s) + \int_{\alpha}^{a}\!\!\int_0^{\beta} D_2(x, y)\, dx\, dy + \int_0^{\alpha}\!\!\int_{\beta}^{b} D_3(x, y)\, dx\, dy$
        $\qquad + \int_{\alpha}^{a}\!\!\int_{\beta}^{b} D_4(x, y)\, dx\, dy = m_4(a, b \mid s)$ (say).

The procedure used to evaluate the integrals (59) to (62) follows the same
general pattern, and we shall confine ourselves to outlining it in one case, say (59).
There

(63)    $m_1(a, b \mid s) = \int_0^{a}\!\!\int_0^{b} D_1(x, y)\, dx\, dy = 4 \int_0^{a} dx \int_{\alpha-x}^{\alpha} dt \left\{\int_0^{b} dy \int_{\beta-y}^{\beta} g^{s}(t, \tau)\, d\tau\right\}.$
Integrating the double integral in the braces by parts for y we get, say,

(64)    $I(t) = \int_0^{b} dy \int_{\beta-y}^{\beta} g^{s}(t, \tau)\, d\tau = \left[y \int_{\beta-y}^{\beta} g^{s}(t, \tau)\, d\tau\right]_{y=0}^{y=b} - \int_0^{b} y\, g^{s}(t, \beta - y)\, dy,$

whence, substituting β − y = τ in the last integral,

(65)    $I(t) = b \int_{\beta-b}^{\beta} g^{s}(t, \tau)\, d\tau - \int_{\beta-b}^{\beta} (\beta - \tau)\, g^{s}(t, \tau)\, d\tau = \int_{\beta-b}^{\beta} (\tau + b - \beta)\, g^{s}(t, \tau)\, d\tau.$

Proceeding now in the same manner with the other double integration in (63),
we conclude that

(66)    $m_1(a, b \mid s) = 4 \int_0^{a} dx \int_{\alpha-x}^{\alpha} I(t)\, dt = 4 \int_{\alpha-a}^{\alpha} (t + a - \alpha)\, I(t)\, dt$
        $= 4 \int_{\alpha-a}^{\alpha} dt \int_{\beta-b}^{\beta} (t + a - \alpha)(\tau + b - \beta)\, g^{s}(t, \tau)\, d\tau,$

where, throughout, g(t, τ) is defined by (48).

Formulae for m_2(a, b | s), m_3(a, b | s) and m_4(a, b | s) are obtained by a similar
procedure. They may conveniently be summarized in the following single
expression. Define a symbol [x] for any real number x by the equations

(67)    $[x] = x$ if $x \ge 0, \qquad [x] = 0$ if $x \le 0.$

With this notation, whatever be the relation between a, b, α and β, we have

(68)    $m(a, b \mid s) = 4 \int_{[\alpha-a]}^{\alpha}\!\!\int_{[\beta-b]}^{\beta} (t + a - \alpha)(\tau + b - \beta)\left(1 - \frac{2\alpha\beta - t\tau}{R'}\right)^{s} dt\, d\tau$
        $\qquad + \{a^{2}[b - \beta]^{2} + b^{2}[a - \alpha]^{2} - [a - \alpha]^{2}[b - \beta]^{2}\}\left(1 - \frac{2\alpha\beta}{R'}\right)^{s}.$

We now allow s to take all values s = 0, 1, 2, ··· with probabilities P_s given
by the generating function (1). Then it follows, from the form of (68), that

(69)    $m(a, b) = 4 \int_{[\alpha-a]}^{\alpha}\!\!\int_{[\beta-b]}^{\beta} (t + a - \alpha)(\tau + b - \beta)\, \Psi\!\left(1 - \frac{2\alpha\beta - t\tau}{R'}\right) dt\, d\tau$
        $\qquad + \{a^{2}[b - \beta]^{2} + b^{2}[a - \alpha]^{2} - [a - \alpha]^{2}[b - \beta]^{2}\}\, \Psi\!\left(1 - \frac{2\alpha\beta}{R'}\right).$

On subtracting from this the square of the first moment of Y, which by (5)
and (8) is

        $a^{2}b^{2}\, \Psi^{2}\!\left(1 - \frac{\alpha\beta}{R'}\right),$

we obtain the variance σ_Y² of Y. But the variance of Y is necessarily equal to the
variance of X.
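
Formula (68) lends itself to a direct numerical check. The sketch below, added
for illustration and not taken from the paper, evaluates the double integral by a
plain midpoint rule; the function names, the argument `area_outer` (the area of
R') and the default number of steps are assumptions made here. For s = 0 nothing
is covered, so Y = ab and the result should be a²b² whatever the relation between
a, b, α and β.

```python
def bracket(x):
    """The symbol [x] of (67): x when non-negative, otherwise 0."""
    return x if x >= 0 else 0.0


def second_moment_uncovered(a, b, alpha, beta, area_outer, s, steps=200):
    """Second moment m(a, b | s) of Y from (68), by the midpoint rule."""
    lo_t, lo_tau = bracket(alpha - a), bracket(beta - b)
    h_t = (alpha - lo_t) / steps
    h_tau = (beta - lo_tau) / steps
    total = 0.0
    for i in range(steps):
        t = lo_t + (i + 0.5) * h_t
        for j in range(steps):
            tau = lo_tau + (j + 0.5) * h_tau
            g = 1.0 - (2.0 * alpha * beta - t * tau) / area_outer
            total += (t + a - alpha) * (tau + b - beta) * g ** s
    integral = total * h_t * h_tau
    # Second line of (68): the part of R where the two points are far apart.
    edge = (a ** 2 * bracket(b - beta) ** 2
            + b ** 2 * bracket(a - alpha) ** 2
            - bracket(a - alpha) ** 2 * bracket(b - beta) ** 2)
    return 4.0 * integral + edge * (1.0 - 2.0 * alpha * beta / area_outer) ** s
```

Subtracting the square of ab(1 − αβ/R')^s from the returned value gives a numerical
estimate of the variance for fixed s.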

5. Particular cases. (i) Ψ_1(u) = u^s. This is the case, considered originally,
in which the number s of centers of the rectangles ρ falling within R' is fixed.
The explicit evaluation of the variance σ_X² depends in this case on the evaluation
of the integral

(70)    $\int_{[\alpha-a]}^{\alpha}\!\!\int_{[\beta-b]}^{\beta} (t + a - \alpha)(\tau + b - \beta)\left\{\left(1 - \frac{2\alpha\beta}{R'}\right) + \frac{t\tau}{R'}\right\}^{s} dt\, d\tau.$

The evaluation is easy if one expands the binomial under the sign of the integral
and integrates term by term. Each such integral is a product of two simple
integrals.

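(ii) Ψ_2(u) = e^{−λR'(1−u)}. Poisson case. This is the case where the probabilities
P_s that there are exactly s centers of rectangles ρ within R' are given by the
Poisson Law, P_s = e^{−λR'}(λR')^s/s!. Substituting the expression of the probability
generating function into (69), we obtain for this case

(71)    $m(a, b) = 4 e^{-2\lambda\alpha\beta} \int_{[\alpha-a]}^{\alpha}\!\!\int_{[\beta-b]}^{\beta} (t + a - \alpha)(\tau + b - \beta) \sum_{s=0}^{\infty} \frac{(\lambda t\tau)^{s}}{s!}\, dt\, d\tau$
        $\qquad + \{a^{2}[b - \beta]^{2} + b^{2}[a - \alpha]^{2} - [a - \alpha]^{2}[b - \beta]^{2}\}\, e^{-2\lambda\alpha\beta}.$

On performing the integration term by term, and contracting the first term
of the resulting infinite series into the second line of equation (71), we readily
obtain the result

(72)    $m(a, b) = 4 e^{-2\lambda\alpha\beta} \sum_{s=1}^{\infty} \frac{(\lambda\alpha\beta)^{s}}{s!}\, \frac{\alpha\beta}{(s+1)^{2}(s+2)^{2}}$
        $\qquad \times \left\{(s+2)a - \alpha + [\alpha - a]\left(1 - \frac{a}{\alpha}\right)^{s+1}\right\}$
        $\qquad \times \left\{(s+2)b - \beta + [\beta - b]\left(1 - \frac{b}{\beta}\right)^{s+1}\right\} + a^{2}b^{2} e^{-2\lambda\alpha\beta},$

where [x] continues to have the meaning defined by (67). In virtue of equations
(7) and (8), however, the last term of the expression (72) is precisely the square
of the first moment of Y when s is Poisson distributed. Hence, for s Poisson
distributed, we have the expression for the variance of Y and of X,

(73)    $\sigma_Y^{2} = \sigma_X^{2} = 4 e^{-2\lambda\alpha\beta} \sum_{s=1}^{\infty} \frac{(\lambda\alpha\beta)^{s}}{s!}\, \frac{\alpha\beta}{(s+1)^{2}(s+2)^{2}}$
        $\qquad \times \left\{(s+2)a - \alpha + [\alpha - a]\left(1 - \frac{a}{\alpha}\right)^{s+1}\right\}$
        $\qquad \times \left\{(s+2)b - \beta + [\beta - b]\left(1 - \frac{b}{\beta}\right)^{s+1}\right\}.$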
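
The series for the Poisson variance converges quickly and is easy to sum. The
following sketch is an illustration added here, not the authors' computation; the
function names and the truncation at `terms` summands are assumptions. Each
summand is the product of the Poisson-type weight (λαβ)^s/s!, the factor
αβ/((s+1)²(s+2)²), and one brace for each side of R.

```python
import math


def side_factor(s, side, cover):
    """Brace {(s+2)*side - cover + [cover - side](1 - side/cover)^(s+1)}."""
    if cover > side:
        tail = (cover - side) * (1.0 - side / cover) ** (s + 1)
    else:
        tail = 0.0  # [cover - side] vanishes when the side is the longer one
    return (s + 2) * side - cover + tail


def poisson_variance(a, b, alpha, beta, lam, terms=200):
    """Variance of the covered area X when s is Poisson with density lam."""
    x = lam * alpha * beta
    weight = 1.0  # running value of x**s / s!
    total = 0.0
    for s in range(1, terms + 1):
        weight *= x / s
        total += (weight * alpha * beta / ((s + 1) ** 2 * (s + 2) ** 2)
                  * side_factor(s, a, alpha) * side_factor(s, b, beta))
    return 4.0 * math.exp(-2.0 * x) * total
```

A useful cross-check is to evaluate the integral form of the second moment
numerically, subtract a²b²e^{−2λαβ}, and compare with the series.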
(iii) Contagious case. This is the case where the probabilities
P_s that there are exactly s centers of rectangles ρ within R' are given by
the contagious law of type A with two parameters². The evaluation of the
second moment of Y is made easy by noticing that the probability generating
function appropriate to the contagious distribution may be expressed as
a series in terms of the probability generating function of the Poisson Law


(74) 


k^O tCl 


= e'” Z 

k^o K\ 




Thus the evaluation of the integral intervening in the formula for the second
moment of Y is reduced in the present case to that of formula (71).


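6. Remarks on other cases. (i) It may be of interest, in amplification of
H. E. Robbins' results, to exhibit the analogues of formulas (68), (69) and (73)
in the one-dimensional case. For this case, then, if the interval a is embedded
in a larger interval a', we obtain by similar methods, beginning with the
calculation of ∂m/∂a,

(75)    $m(a \mid s) = 2 \int_{[\alpha-a]}^{\alpha} (t + a - \alpha)\left(1 - \frac{2\alpha - t}{a'}\right)^{s} dt + [a - \alpha]^{2}\left(1 - \frac{2\alpha}{a'}\right)^{s},$

whence

(76)    $m(a) = 2 \int_{[\alpha-a]}^{\alpha} (t + a - \alpha)\, \Psi\!\left(1 - \frac{2\alpha - t}{a'}\right) dt + [a - \alpha]^{2}\, \Psi\!\left(1 - \frac{2\alpha}{a'}\right);$

in particular, if s is Poisson distributed,

(77)    $\sigma_X^{2} = \sigma_Y^{2} = 2 e^{-2\lambda\alpha} \sum_{s=1}^{\infty} \frac{(\alpha\lambda)^{s}}{s!}\, \frac{\alpha}{(s+1)(s+2)} \times \left\{(s+2)a - \alpha + [\alpha - a]\left(1 - \frac{a}{\alpha}\right)^{s+1}\right\}.$

The close parallel between these formulas and those for two dimensions makes it
natural to conjecture analogous formulas for n dimensions; but we have not
attempted to establish such formulas.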

(ii) For the evaluation of the higher moments of Y it may be useful to notice
that precisely the same method as that described above leads to the conclusion
that the derivative of the n-th non-central moment of Y is

(78)    $\frac{\partial^{2} m_n(a, b)}{\partial a\,\partial b} = \lim_{\Delta a, \Delta b \to 0} \frac{1}{\Delta a\,\Delta b}\{n E(Y^{n-1}W) + n(n-1) E(Y^{n-2}UV)\}.$

² J. Neyman, "On a new class of contagious distributions", Annals of Math. Stat.,
Vol. 10 (1939), pp. 35-57.



ON THE MEASURE OF A RANDOM SET. II

By H. E. Robbins

Postgraduate School, U. S. Naval Academy

1. Introduction. In a recent paper¹ the author derived general formulas for
the moments of the measure of any random set X, and applied the formulas to
find the mean and variance of a random sum of intervals on the line. In a
subsequent paper² J. Bronowski and J. Neyman, using other methods, found the
variance when X is a random sum of rectangles in the plane, and raised the
question of finding the variance when X is a random sum of n-dimensional
intervals in n-space. This will be done in the present paper, independently of
the work of Bronowski and Neyman, using the methods of (I). The corresponding
problem for circles in the plane will also be solved.

2. n-dimensional intervals, N fixed. Let the random set X be defined as
follows. Let A_i, a_i (the range of the subscript i throughout this paper will be
from 1 to n) and δ be fixed positive numbers such that a_i < 2δ. Let R denote the
n-dimensional interval consisting of all points (x_1, ···, x_n) such that 0 ≤ x_i ≤
A_i, and let R' denote the larger interval for which −δ ≤ x_i ≤ A_i + δ (and also
its measure Π(A_i + 2δ)). Let a fixed number N of intervals with sides a_i
parallel to the axes be chosen independently, with the probability density
function for the center of each interval constant and equal to 1/R' in R'. The set X
is the intersection of the set-theoretical sum of the N intervals with R. The set
Y consists of those points of R that do not belong to X. We have identically

(1)    $X + Y = R,$

where capital letters denote either sets or their measures.

From (I), equation (15), we have

(2)    $E(Y) = \int_0^{A_1} \cdots \int_0^{A_n} p(x_1, \cdots, x_n)\, dx_1 \cdots dx_n,$

where, setting r = Πa_i, we have

(3)    $p(x_1, \cdots, x_n) = \Pr((x_1, \cdots, x_n) \in Y) = \left(1 - \frac{r}{R'}\right)^{N}.$

Hence

(4)    $E(Y) = R\left(1 - \frac{r}{R'}\right)^{N}.$

¹ H. E. Robbins, "On the measure of a random set," Annals of Math. Stat., Vol. 15
(1944), pp. 70-74. We shall refer to this paper as (I).

² J. Bronowski and J. Neyman, "The variance of the measure of a two-dimensional
random set," Annals of Math. Stat., Vol. 16 (1945), pp. 330-341. We shall refer to
this paper as (BN).
From (1) it follows that

(5)    $E(X) = R\left\{1 - \left(1 - \frac{r}{R'}\right)^{N}\right\}.$

From (I), equation (21), we have

(6)    $E(Y^{2}) = \int_0^{A_1} \cdots \int_0^{A_n} \int_0^{A_1} \cdots \int_0^{A_n} p(x_1, \cdots, x_n, y_1, \cdots, y_n)\, dx_1 \cdots dx_n\, dy_1 \cdots dy_n,$

where

(7)    $p(x_1, \cdots, x_n, y_1, \cdots, y_n) = \Pr((x_1, \cdots, x_n) \in Y$ and $(y_1, \cdots, y_n) \in Y).$

It is clear from the symmetry of the problem that the distribution of Y will be
unchanged if we assume that for all i, x_i ≤ y_i. Hence, since there are 2^n possible
sets of n inequalities each, we can write

(8)    $E(Y^{2}) = 2^{n} \int \cdots \int_{x_i \le y_i} p\; dx_1 \cdots dx_n\, dy_1 \cdots dy_n.$

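We now introduce the new variables of integration

(9)    $u_i = x_i, \qquad v_i = y_i - x_i,$

for which

(10)    $\frac{\partial(u_1, \cdots, u_n, v_1, \cdots, v_n)}{\partial(x_1, \cdots, x_n, y_1, \cdots, y_n)} = 1.$

In terms of the new variables we have

(11)    $p = f(v_1, \cdots, v_n) = \left(1 - \frac{2r}{R'}\right)^{N}$ if $v_i \ge a_i$ for some i,
        $p = f(v_1, \cdots, v_n) = \left(1 - \frac{2r - \prod(a_i - v_i)}{R'}\right)^{N}$ if $v_i < a_i$ for all i.

Equation (8) now becomes

(12)    $E(Y^{2}) = 2^{n} \int_0^{A_1} \cdots \int_0^{A_n} \int_0^{A_1 - v_1} \cdots \int_0^{A_n - v_n} f\; du_1 \cdots du_n\, dv_1 \cdots dv_n$
        $= 2^{n} \int_0^{A_1} \cdots \int_0^{A_n} f \prod(A_i - v_i)\, dv_1 \cdots dv_n.$

Let α_i = min(a_i, A_i). Then from (11) and (12) we obtain

(13)    $E(Y^{2}) = 2^{n} \int_0^{\alpha_1} \cdots \int_0^{\alpha_n} \left(1 - \frac{2r - \prod(a_i - v_i)}{R'}\right)^{N} \prod(A_i - v_i)\, dv_1 \cdots dv_n$
        $\qquad + 2^{n}\left(1 - \frac{2r}{R'}\right)^{N} \left\{\int_0^{A_1} \cdots \int_0^{A_n} \prod(A_i - v_i)\, dv_1 \cdots dv_n - \int_0^{\alpha_1} \cdots \int_0^{\alpha_n} \prod(A_i - v_i)\, dv_1 \cdots dv_n\right\}.$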
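Let the symbol [x], as in (BN), be defined by

(14)    $[x] = x$ if $x \ge 0, \qquad [x] = 0$ if $x < 0.$

In the integral in the first line of (13) we introduce the new variables of
integration w_i = a_i − v_i, while in the two integrals in the second line we introduce
the variables s_i = A_i − v_i. The result is

(15)    $E(Y^{2}) = 2^{n} \int_{[a_1 - A_1]}^{a_1} \cdots \int_{[a_n - A_n]}^{a_n} \left(1 - \frac{2r - \prod w_i}{R'}\right)^{N} \prod(w_i + A_i - a_i)\, dw_1 \cdots dw_n$
        $\qquad + 2^{n}\left(1 - \frac{2r}{R'}\right)^{N} \left\{\int_0^{A_1} \cdots \int_0^{A_n} \prod s_i\, ds_1 \cdots ds_n - \int_{[A_1 - a_1]}^{A_1} \cdots \int_{[A_n - a_n]}^{A_n} \prod s_i\, ds_1 \cdots ds_n\right\}$
        $= 2^{n} \int_{[a_1 - A_1]}^{a_1} \cdots \int_{[a_n - A_n]}^{a_n} \left(1 - \frac{2r - \prod w_i}{R'}\right)^{N} \prod(w_i + A_i - a_i)\, dw_1 \cdots dw_n$
        $\qquad + \left(1 - \frac{2r}{R'}\right)^{N} \left\{\prod A_i^{2} - \prod(A_i^{2} - [A_i - a_i]^{2})\right\}.$

From (1) we see that σ_X² = E(X²) − E²(X) = E(Y²) − E²(Y). Thus from (4)
and (15) we have

(16)    $\sigma_X^{2} = 2^{n} \int_{[a_1 - A_1]}^{a_1} \cdots \int_{[a_n - A_n]}^{a_n} \left(1 - \frac{2r - \prod w_i}{R'}\right)^{N} \prod(w_i + A_i - a_i)\, dw_1 \cdots dw_n$
        $\qquad + \left(1 - \frac{2r}{R'}\right)^{N} \left\{\prod A_i^{2} - \prod(A_i^{2} - [A_i - a_i]^{2})\right\} - R^{2}\left(1 - \frac{r}{R'}\right)^{2N}.$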

3. n-dimensional intervals, N variable. Now let X and Y be defined as before
except that the number N is taken as a random variable, capable of assuming the
values 0, 1, ··· with respective probabilities P_0, P_1, ···, and with generating
function

(17)    $\varphi(t) = \sum_{N=0}^{\infty} P_N\, t^{N}.$
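Then from (5) we have

(18)    $E(X) = \sum_{N=0}^{\infty} P_N\, R\left\{1 - \left(1 - \frac{r}{R'}\right)^{N}\right\} = R\left\{1 - \varphi\!\left(1 - \frac{r}{R'}\right)\right\},$

while from (15) we have

(19)    $\sigma_X^{2} = E(Y^{2}) - E^{2}(Y) = 2^{n} \int_{[a_1 - A_1]}^{a_1} \cdots \int_{[a_n - A_n]}^{a_n} \varphi\!\left(1 - \frac{2r - \prod w_i}{R'}\right) \prod(w_i + A_i - a_i)\, dw_1 \cdots dw_n$
        $\qquad + \varphi\!\left(1 - \frac{2r}{R'}\right) \left\{\prod A_i^{2} - \prod(A_i^{2} - [A_i - a_i]^{2})\right\} - R^{2} \varphi^{2}\!\left(1 - \frac{r}{R'}\right).$

In particular, suppose that, as in (BN), N has a Poisson distribution with a
parameter λ,

(20)    $P_N = e^{-\lambda R'}\, \frac{(\lambda R')^{N}}{N!},$

so that

(21)    $\varphi(t) = e^{-\lambda R'(1 - t)}.$

Then (18) becomes

(22)    $E(X) = R\{1 - e^{-\lambda r}\},$

while (19) becomes

(23)    $\sigma_X^{2} = 2^{n} e^{-2\lambda r} \int_{[a_1 - A_1]}^{a_1} \cdots \int_{[a_n - A_n]}^{a_n} \left\{\sum_{N=0}^{\infty} \frac{(\lambda \prod w_i)^{N}}{N!}\right\} \left\{\prod(w_i + A_i - a_i)\right\} dw_1 \cdots dw_n$
        $\qquad + e^{-2\lambda r} \left\{\prod A_i^{2} - \prod(A_i^{2} - [A_i - a_i]^{2})\right\} - R^{2} e^{-2\lambda r}.$

Integrating term by term and simplifying the resulting expression, we obtain
finally

(24)    $\sigma_X^{2} = r \cdot 2^{n} \cdot e^{-2\lambda r} \sum_{N=1}^{\infty} \frac{(\lambda r)^{N}}{N!}\, \frac{1}{\{(N+1)(N+2)\}^{n}} \prod\left\{(N+2)A_i - a_i + [a_i - A_i]\left(1 - \frac{A_i}{a_i}\right)^{N+1}\right\}.$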
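
The final series can be summed for any number of dimensions with a few lines of
code. The sketch below is an illustration added here, not part of the paper; the
function name, the argument names (`sides` for the A_i, `covers` for the a_i) and
the truncation at `terms` summands are assumptions. It requires Python 3.8 or
later for `math.prod`.

```python
import math


def robbins_poisson_variance(sides, covers, lam, terms=200):
    """Variance of the covered measure X for Poisson N, summed from (24).

    sides  -- the lengths A_1, ..., A_n of the interval R
    covers -- the lengths a_1, ..., a_n of the random intervals
    lam    -- the Poisson density of centers per unit measure
    """
    n = len(sides)
    r = math.prod(covers)
    x = lam * r
    weight = 1.0  # running value of x**N / N!
    total = 0.0
    for N in range(1, terms + 1):
        weight *= x / N
        product = 1.0
        for A, a in zip(sides, covers):
            # [a - A](1 - A/a)^(N+1) is non-zero only when a exceeds A
            tail = (a - A) * (1.0 - A / a) ** (N + 1) if a > A else 0.0
            product *= (N + 2) * A - a + tail
        total += weight * product / ((N + 1) * (N + 2)) ** n
    return r * 2 ** n * math.exp(-2.0 * x) * total
```

For n = 1 and A ≥ a the integral form of the variance can be evaluated in closed
form, which gives an independent check on the series.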


4. Circles in the plane. Let the random set X be defined as follows. Let
A_1, A_2, a, and δ be fixed positive numbers such that 2a < min(A_1, A_2, 2δ).
Let R denote the rectangle consisting of all points (x_1, x_2) such that 0 ≤ x_1 ≤ A_1,
0 ≤ x_2 ≤ A_2, and let R' denote the larger rectangle for which −δ ≤ x_1 ≤ A_1 + δ,
−δ ≤ x_2 ≤ A_2 + δ. Let a fixed number N of circles with radii a and areas
b = πa² be chosen independently, with the probability density function for
the center of each circle constant and equal to 1/R' in R'. The set X is the
intersection of the set-theoretical sum of the N circles with R. The set Y
consists of those points of R that do not belong to X. Equation (1) holds as before.
The analogue of (4) is

(25)    $E(Y) = \int_0^{A_1}\!\!\int_0^{A_2} p(x_1, x_2)\, dx_1\, dx_2 = R\left(1 - \frac{b}{R'}\right)^{N},$

while (8) becomes

(26)    $E(Y^{2}) = 4 \int\!\!\int\!\!\int\!\!\int_{x_i \le y_i} p(x_1, x_2, y_1, y_2)\, dx_1\, dx_2\, dy_1\, dy_2,$

where

(27)    $p(x_1, x_2, y_1, y_2) = \Pr((x_1, x_2) \in Y$ and $(y_1, y_2) \in Y).$

Introducing the new variables (9) we obtain the analogue of (12),

(28)    $E(Y^{2}) = 4 \int_0^{A_1}\!\!\int_0^{A_2} f(v_1, v_2)(A_1 - v_1)(A_2 - v_2)\, dv_1\, dv_2,$

where, setting r = (v_1² + v_2²)^{1/2},

(29)    $f(v_1, v_2) = \left(1 - \frac{2b}{R'}\right)^{N}$ for $r \ge 2a,$
        $f(v_1, v_2) = \left\{1 - \frac{2b - 2a^{2} \arccos\!\left(\frac{r}{2a}\right) + \frac{r}{2}\sqrt{4a^{2} - r^{2}}}{R'}\right\}^{N}$ for $r < 2a.$

Introducing polar coordinates r, θ in the v_1, v_2-plane and carrying out the obvious
integrations, we obtain

(30)    $E(Y^{2}) = \left(1 - \frac{2b}{R'}\right)^{N} \left\{R^{2} + \frac{32}{3} a^{3}(A_1 + A_2) - 8a^{4} - 4bR\right\}$
        $\qquad + 8a^{2} \int_0^{1} \left(\pi R t + 4a^{2}t^{3} - 4a(A_1 + A_2)t^{2}\right) \left\{1 - \frac{2b - 2a^{2} \arccos t + 2a^{2} t\sqrt{1 - t^{2}}}{R'}\right\}^{N} dt.$
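
The expression inside the braces of the function f is one minus the area of the
union of two circles of radius a, divided by R'. The union is 2b minus the
lens-shaped common part, and the lens has the standard area
2a² arccos(r/2a) − (r/2)√(4a² − r²). A minimal sketch of that reading follows; the
function names and the argument `area_outer` (the area of R') are illustrative
assumptions, not notation from the paper.

```python
import math


def lens_area(r, a):
    """Area common to two circles of radius a whose centers are r apart."""
    if r >= 2.0 * a:
        return 0.0
    return (2.0 * a * a * math.acos(r / (2.0 * a))
            - 0.5 * r * math.sqrt(4.0 * a * a - r * r))


def prob_both_uncovered(v1, v2, a, area_outer, n_circles):
    """Probability that two points separated by (v1, v2) are both uncovered."""
    r = math.hypot(v1, v2)
    b = math.pi * a * a  # area of one circle
    union_area = 2.0 * b - lens_area(r, a)
    return (1.0 - union_area / area_outer) ** n_circles
```

At r = 0 the lens is the whole circle and the value reduces to (1 − b/R')^N, the
single-point probability used for E(Y); for r ≥ 2a it reduces to (1 − 2b/R')^N.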

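If now N is a random variable with generating function (17), then (25) becomes

(31)    $E(Y) = R\, \varphi\!\left(1 - \frac{b}{R'}\right),$

and hence

(32)    $E(X) = R\left\{1 - \varphi\!\left(1 - \frac{b}{R'}\right)\right\},$

        $\sigma_X^{2} = E(X^{2}) - E^{2}(X) = E(Y^{2}) - E^{2}(Y)$
        $= \varphi\!\left(1 - \frac{2b}{R'}\right) \left\{R^{2} + \frac{32}{3} a^{3}(A_1 + A_2) - 8a^{4} - 4bR\right\} - R^{2} \varphi^{2}\!\left(1 - \frac{b}{R'}\right)$
        $\qquad + 8a^{2} \int_0^{1} \left(\pi R t + 4a^{2}t^{3} - 4a(A_1 + A_2)t^{2}\right) \varphi\!\left(1 - \frac{2b - 2a^{2} \arccos t + 2a^{2} t\sqrt{1 - t^{2}}}{R'}\right) dt.$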





SAMPLING FROM A CHANGING POPULATION¹,²

By Reinhold Baer

University of Illinois

1. Introduction. If, in sampling a certain population, it is impossible to take
more than one sample at any given time, and if the population changes between
any two samples, then we are confronted with the following mathematical
situation. For every³ t, 0 ≤ t ≤ 1, there is given a distribution⁴ (= population)
D(t). Let furthermore t_j be, for 0 < j ≤ n, a number between (j − 1)/n and
j/n; and assume that x_j is a sample taken from the population D(t_j). We denote
by T_n the set of the numbers t_1, ···, t_n and by O(T_n) the sample consisting of
the x_j; and we assume that O(T_n) is a random sample, i.e. that x_1, ···, x_n are
independent variables. The question arises how to get information concerning the
family D(t) from the sample O(T_n). It is clearly hopeless to try for information
concerning an individual D(t) or even some D(t_j) or the statistics that may be
derived from them. But we may hope for information in the mean, if we assume
that the family D(t) is in some sense continuous in t. To make this statement
more precise we denote by a(t) the average and by M_i(t) the i-th moment of
D(t) around its average. We assume then that a(t) and M_i(t), for i ≤ 8, exist
and are continuous functions of t, and in section 7 we shall have to assume
furthermore that a(t) and M_2(t) are functions of bounded variation. These
hypotheses assure the existence of

the mean average $a = \int_0^{1} a(t)\, dt$

and the mean i-th moment $M_i = \int_0^{1} M_i(t)\, dt$

for i ≤ 8. Clearly we may hope for information concerning a and M_i from the
random sample O(T_n). It is our object to discuss certain more or less well
known statistics of the sample O(T_n), and to determine their stochastic limits⁵.

¹ Presented to the American Mathematical Society, September 16, 1945.

² The author is indebted to Dr. E. L. Welker for checking the results, in particular those
rather obnoxious computations needed in sections 6 and 7 which the author did not
incorporate into this paper.

³ It constitutes a restriction of generality that we consider finite closed intervals only.
But it is no further loss in generality to use the interval from 0 to 1, and this choice certainly
simplifies notations.

⁴ Comparatively little will be assumed of these distributions. These properties will
be enumerated in Section 2.

⁵ See [2] p. 81 and the criterion 2.d. of section 2.
As an illustration we mention the following results which will be obtained in the
course of this investigation (among others):⁶

$\bar{x} = n^{-1} \sum_{j=1}^{n} x_j$ converges stochastically to the mean average a;

$s^{2} = n^{-1} \sum_{j=1}^{n} (x_j - \bar{x})^{2}$ converges stochastically to $M_2 + \int_0^{1} (a(t) - a)^{2}\, dt$;

$d^{2} = (2n)^{-1} \sum_{j=1}^{n-1} (x_j - x_{j+1})^{2}$ converges stochastically to the mean variance M_2.

It is clear that M_2 is the stochastic limit of s² if, and only if, a(t) is constant.
If a(t) is not constant, then s² is not a consistent estimate⁷ of M_2, and will have
to be rejected, at least for large n, in favor of d², which is always a consistent
estimate of M_2.

It was this last point that led us into this investigation. Recently the
statistic d² has found much attention; and the question arose as to why the statistic
s² should be rejected in favor of d². Reading the illuminating introduction of the
fundamental paper [1], one sees that just such a situation as we have attempted
to describe here in somewhat abstract terms has necessitated the use of d².
Consequently our result may be considered a theoretical justification for this
procedure.

Our other results will be discussed in their interrelation as they are obtained.
It should be noted that all our results concern themselves with stochastic
convergence, and thus they justify the use of a sample function as an estimate of
some statistical number only for sufficiently large size n of the sample. Thus
it is quite possible that for small n other functions provide better estimates.
The practical applicability of our results depends, therefore, on a criterion for n
to be sufficiently large, and unfortunately such a criterion is not yet available.
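
The contrast between s² and d² can be made concrete by computing their exact
expectations for independent x_j with given means a(t_j) and variances M_2(t_j). The
sketch below is an illustration added here, not a computation from the paper; the
function name and the example family (a(t) = t, M_2(t) = 1) are assumptions. It
uses only E(x_j − x_{j+1})² = M_2(t_j) + M_2(t_{j+1}) + (a(t_j) − a(t_{j+1}))² and the
corresponding identity for s².

```python
def expected_statistics(means, variances):
    """Exact expectations of s^2 and d^2 for independent x_1, ..., x_n.

    means[j] and variances[j] play the roles of a(t_j) and M_2(t_j).
    """
    n = len(means)
    mean_bar = sum(means) / n
    var_bar = sum(variances) / n
    # Spread of the individual averages around their mean.
    spread = sum((m - mean_bar) ** 2 for m in means) / n
    exp_s2 = (1.0 - 1.0 / n) * var_bar + spread
    exp_d2 = sum(variances[j] + variances[j + 1]
                 + (means[j] - means[j + 1]) ** 2
                 for j in range(n - 1)) / (2.0 * n)
    return exp_s2, exp_d2
```

With a(t) = t and M_2(t) = 1 on a fine grid, the expectation of s² is close to
1 + 1/12 while that of d² is close to 1, in line with the limits stated above.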

2. Notations and fundamental properties. We have not stated in the
Introduction the hypotheses to which we subject the distributions under
consideration. For our investigation we shall need only very few properties of
distributions. Thus we are going to enumerate now some properties of distributions
which we are going to use, and we shall assume throughout that these properties
are satisfied. As will be seen these hypotheses are rather weak and are satisfied
by a large class of distributions.

If x is any stochastic variable, then we denote by E(x) its mathematical
expectation, and the only properties of stochastic variables that concern us are
properties of their expectations. E(x) is a linear operation satisfying E(1) = 1.
If furthermore x_1, ···, x_n are independent variables, and if the function f
depends on some of these variables whereas g depends only on the others, then
E(fg) = E(f)E(g), and this property may serve as a definition of independence.

⁶ It should be noted that the stochastic limit of the following statistics would not be
changed, if we substituted for the denominator n of s² the denominator n − 1 which is often
used, and if we allowed the summation in the expression for d² to range from 1 to n, defining
x_{n+1} as x_1.

⁷ Wilks [2], p. 133.

As stated in the Introduction we are going to study a family D(t) of
distributions, for 0 ≤ t ≤ 1. If x is the stochastic variable of the distribution D(t)
for some fixed t, then we let

        $a(t) = E(x)$ and $M_i(t) = E((x - a(t))^{i}).$

We shall assume throughout that the average a(t) and the variance M_2(t) exist
for every t, and that a(t) and M_2(t) are continuous functions of t. Moreover,
when discussing M_i(t), 1 < i ≤ 4, we shall assume that every M_j(t) with j ≤
2i is a continuous function of t. Thus we are sure that the mean average a
and the mean variance M_2, as defined in the Introduction, always exist, and
the mean i-th moment M_i exists, whenever M_i(t) is a continuous function of t.

Remark: If the mean i-th moment M_i exists for every i, then one may be
tempted to consider as the mean of the family D(t) a distribution D with average
a and i-th moment M_i, provided such a distribution exists. But this has to be
done with some caution. For suppose that every D(t) is normal. Then M_i(t) =
0 for every odd i, implying M_i = 0 for odd i so that D would be symmetric.
But M_{2i}(t) = 1·3 ··· (2i − 1) M_2(t)^i and hence M_{2i} = 1·3 ··· (2i − 1)
$\int_0^{1} M_2(t)^{i}\, dt$, and the integral will be the i-th power of M_2 only if M_2(t) is
constant. Thus the mean distribution D of a continuous family of normal
distributions need not be normal.

As in the Introduction we now let $t_i$ be some number between $(i - 1)/n$ and
$i/n$, and denote by $x_i$ a sample taken from the distribution $D(t_i)$. We denote
by $T_n$ the set of the $n$ numbers $t_i$ and by $O(T_n)$ the sample consisting of the $x_i$.
It will be assumed throughout that $O(T_n)$ is a random sample, i.e. we shall
assume that $x_1, \dots, x_n$ are independent variables.

We are not going to make any use of the customary definition of stochastic
convergence* (and we shall therefore not restate it). Instead we are going to
apply throughout the following criterion:*

2.d. The function $f(O(T_n))$ of the sample $O(T_n)$ converges stochastically to the
number $r$, if

$$\lim_{n\to\infty} E\big(f(O(T_n))\big) = r \quad\text{and}\quad \lim_{n\to\infty} E\big([f(O(T_n)) - E(f(O(T_n)))]^2\big) = 0.$$

All the sample functions considered will be polynomials of the variables
$x_1, \dots, x_n$.

•Wilks [2], p. 81. 

•Wilks [2], Theorem (A), p. 134. 

The validity of criterion 2.d. implies stochastic convergence in the customary sense.
Thus, all results obtained in the present paper remain valid also when the customary
definition of stochastic convergence is adopted.
3. The mean average. Though the discussion of this section is rather obvious,
we give the details, since they may serve as a convenient introduction to the type
of argument we have to use throughout.

Theorem. $\bar{x}$ converges stochastically to $a$.

Proof: We note first that $E(\bar{x}) = n^{-1} \sum_{j=1}^{n} E(x_j) = n^{-1} \sum_{j=1}^{n} a(t_j)$.
Since $t_j$ is between $(j - 1)/n$ and $j/n$, and since $n^{-1}$ is the length of this
interval, it follows from the continuity of $a(t)$ that

$$a = \int_0^1 a(t)\,dt = \lim_{n\to\infty} n^{-1} \sum_{j=1}^{n} a(t_j),$$

and thus we have shown that $E(\bar{x})$ tends to $a$ as $n$ tends to infinity.

Next we find that

$$E\big((\bar{x} - E(\bar{x}))^2\big) = n^{-2} E\Big(\Big[\sum_{j=1}^{n} (x_j - a(t_j))\Big]^2\Big) = n^{-2} \sum_{j=1}^{n} E\big((x_j - a(t_j))^2\big) = n^{-2} \sum_{j=1}^{n} M_2(t_j),$$

since $E\big((x_j - a(t_j))(x_h - a(t_h))\big) = E(x_j - a(t_j))\,E(x_h - a(t_h)) = 0$ for $j \ne h$.
But $M_2(t)$ is, for $0 \le t \le 1$, a bounded non-negative function, showing that
$E\big((\bar{x} - E(\bar{x}))^2\big)$ tends to 0 as $n$ tends to infinity. Applying 2.d. we find that
$\bar{x}$ converges stochastically to $a$, as we intended to show.

Remark: It is clear that the speed of the stochastic convergence of $\bar{x}$ to $a$
depends on two factors:

(i) the goodness of $\bar{x}$ as an estimate of $E(\bar{x})$;

(ii) the speed of convergence of the sums $n^{-1} \sum_{j=1}^{n} a(t_j)$ to the integral
$a = \int_0^1 a(t)\,dt$.

It is this difficulty which expresses itself in (ii) and which makes the present
type of statistical estimation less effective than the one concerned with sampling
from one distribution only. As to (i), it is again, as may be seen from the proof,
of the order of magnitude $(M_2/n)^{1/2}$ (see Theorem 1, section 4).

It is probable that $\bar{x}$ is a better estimate of $E(\bar{x})$ than of $a$. But this does not
help, since the former depends on the particular choice of $T_n$.
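
The two factors of the remark can be separated numerically. The following is a minimal sketch, assuming the hypothetical choices $a(t) = t$, $M_2(t) = 1$ and $t_j = j/n$ (right end points); the function name is illustrative only. It evaluates $E(\bar{x}) = n^{-1}\sum a(t_j)$ and $E((\bar{x} - E(\bar{x}))^2) = n^{-2}\sum M_2(t_j)$ exactly, as in the proof above.

```python
from fractions import Fraction


def mean_and_variance_of_sample_mean(a, m2, ts):
    """Return E(x_bar) and E((x_bar - E(x_bar))^2) for independent x_j drawn from D(t_j)."""
    n = len(ts)
    expectation = sum(a(t) for t in ts) / n        # n^{-1} * sum a(t_j)
    variance = sum(m2(t) for t in ts) / n ** 2     # n^{-2} * sum M_2(t_j)
    return expectation, variance


n = 100
ts = [Fraction(j, n) for j in range(1, n + 1)]     # assumed choice t_j = j/n

exp_mean, var_mean = mean_and_variance_of_sample_mean(
    lambda t: t,               # assumed a(t) = t, so that a = 1/2
    lambda t: Fraction(1),     # assumed M_2(t) = 1
    ts,
)
riemann_error = exp_mean - Fraction(1, 2)          # factor (ii): E(x_bar) - a
print(exp_mean, var_mean, riemann_error)
```

In this example the variance is of order $1/n$, so the sampling error (i) is of order $n^{-1/2}$, while the Riemann-sum error (ii) depends on where the $t_j$ sit inside their intervals.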

4. The variance.

Theorem 1. $d^2$ converges stochastically to $M_2$.

Proof: We note first that

$$E\big((x_j - x_{j+1})^2\big) = E\big([(x_j - a(t_j)) + (a(t_j) - a(t_{j+1})) + (a(t_{j+1}) - x_{j+1})]^2\big)$$
$$= M_2(t_j) + (a(t_j) - a(t_{j+1}))^2 + M_2(t_{j+1}),$$

since $E\big((x_j - a(t_j))(x_{j+1} - a(t_{j+1}))\big) = E(x_j - a(t_j))\,E(x_{j+1} - a(t_{j+1})) = 0$,
$E(\text{const}) = \text{const}$ and $E\big((x_j - a(t_j))^2\big) = M_2(t_j)$. Hence

$$E(d^2) = (2n)^{-1}(A + B - C),$$

where $A = 2 \sum_{j=1}^{n} M_2(t_j)$, $B = \sum_{j=1}^{n-1} (a(t_j) - a(t_{j+1}))^2$, $C = M_2(t_1) + M_2(t_n)$. Since
$t_j$ is a value between $(j - 1)/n$ and $j/n$, and since $n^{-1}$ is the length of this interval,
it follows from the continuity of the function $M_2(t)$ that

$$M_2 = \int_0^1 M_2(t)\,dt = \lim_{n\to\infty} (2n)^{-1} A.$$

Since $M_2(t)$ is bounded as a continuous function, it follows that
$(2n)^{-1} C$ tends to 0 as $n$ tends to infinity. Finally we infer from the continuity
of $a(t)$, which is used here for the first time to its full extent, that there exists
to every given positive $\varepsilon$ an integer $N = N(\varepsilon)$ such that $(a(t') - a(t''))^2 < \varepsilon$
for $|t' - t''| < 2N^{-1}$. Thus for $N(\varepsilon) < n$ we have $(a(t_j) - a(t_{j+1}))^2 < \varepsilon$ and
$(2n)^{-1} B < \varepsilon$. Hence $(2n)^{-1} B$ tends to 0 as $n$ tends to infinity, and we have
shown that $E(d^2)$ tends to $M_2$ as $n$ tends to infinity.

Next we note that

$$E\big((d^2 - E(d^2))^2\big) = E(d^4) - E(d^2)^2$$
$$= (2n)^{-2} \sum_{i,j} \big[E\big((x_i - x_{i+1})^2 (x_j - x_{j+1})^2\big) - E\big((x_i - x_{i+1})^2\big)\,E\big((x_j - x_{j+1})^2\big)\big].$$

But if both $i$ and $i + 1$ are different from $j$ and $j + 1$, then
$E\big((x_i - x_{i+1})^2 (x_j - x_{j+1})^2\big) = E\big((x_i - x_{i+1})^2\big)\,E\big((x_j - x_{j+1})^2\big)$, and thus there are not more than $3n$
summands in the above summation that are not identically 0. These summands,
however, depend only on $a(t_k)$, $M_2(t_k)$, $M_3(t_k)$ and $M_4(t_k)$, and they are
therefore bounded. Thus $E\big((d^2 - E(d^2))^2\big)$ is equal to $(2n)^{-2}$ times a sum of
not more than $3n$ summands which are bounded. Hence $E\big((d^2 - E(d^2))^2\big)$ tends
to 0, as $n$ tends to infinity. Now our theorem is an immediate consequence of
the criterion 2.d.
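
The decomposition $E(d^2) = (2n)^{-1}(A + B - C)$ can be evaluated directly. The sketch below assumes that $d^2 = (2n)^{-1}\sum_{j=1}^{n-1}(x_j - x_{j+1})^2$, which is how the Introduction (not reproduced here) appears to define it, and uses the hypothetical choices $a(t) = t$, $M_2(t) = 1$, $t_j = j/n$; the function name is illustrative.

```python
from fractions import Fraction


def expected_d2(a, m2, ts):
    """E(d^2) = (2n)^{-1} (A + B - C) for the points t_1, ..., t_n in ts."""
    n = len(ts)
    A = 2 * sum(m2(t) for t in ts)                                   # 2 * sum M_2(t_j)
    B = sum((a(ts[j]) - a(ts[j + 1])) ** 2 for j in range(n - 1))    # squared jumps of a(t)
    C = m2(ts[0]) + m2(ts[-1])                                       # end-point correction
    return (A + B - C) / (2 * n)


n = 100
ts = [Fraction(j, n) for j in range(1, n + 1)]                       # assumed t_j = j/n
e_d2 = expected_d2(lambda t: t, lambda t: Fraction(1), ts)           # assumed a(t) = t, M_2(t) = 1
print(float(e_d2))
```

With these choices $M_2 = 1$, and the value returned is already close to 1 for $n = 100$: the contribution of $B$ is of order $n^{-2}$ here because the trend $a(t)$ changes little between neighbouring observations.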

Theorem 2. $s^2$ converges stochastically to $M_2 + \int_0^1 (a(t) - a)^2\,dt$.

Proof: We note first that $n(x_j - \bar{x}) = \sum_{h=1}^{n} (x_j - x_h)$ and therefore
$s^2 = n^{-3} \sum_{j=1}^{n} \sum_{h,k} (x_j - x_h)(x_j - x_k)$. Since
$x_j - x_h = x_j - a(t_j) + a(t_j) - a(t_h) - (x_h - a(t_h))$, we find as usual that

$$E\big((x_j - x_h)^2\big) = M_2(t_j) + (a(t_j) - a(t_h))^2 + M_2(t_h),$$

and if $h \ne k$ we find that

$$E\big((x_j - x_h)(x_j - x_k)\big) = M_2(t_j) + (a(t_j) - a(t_h))(a(t_j) - a(t_k)).$$

Consequently

$$\sum_{h,k} E\big((x_j - x_h)(x_j - x_k)\big) = n^2 M_2(t_j) + \sum_{h=1}^{n} M_2(t_h) + \sum_{h,k} (a(t_j) - a(t_h))(a(t_j) - a(t_k))$$
$$= n^2 M_2(t_j) + \sum_{h=1}^{n} M_2(t_h) + \Big[\sum_{h=1}^{n} (a(t_j) - a(t_h))\Big]^2.$$

Consequently

$$E(s^2) = n^{-1} \sum_{j=1}^{n} M_2(t_j) + n^{-2} \sum_{h=1}^{n} M_2(t_h) + n^{-3} \sum_{j=1}^{n} \Big[\sum_{h=1}^{n} (a(t_j) - a(t_h))\Big]^2.$$

As in the proof of Theorem 1 we see that the first of these sums tends to $M_2$ as
$n$ tends to infinity, and the second of these sums therefore tends to 0 as $n$ tends
to infinity. The last sum equals

$$n^{-3} \sum_{j,h,k} \big[a(t_j)^2 - a(t_j)(a(t_h) + a(t_k)) + a(t_h)a(t_k)\big]$$
$$= n^{-1} \sum_{j=1}^{n} a(t_j)^2 - 2n^{-2} \sum_{j,h} a(t_j)a(t_h) + n^{-2} \sum_{h,k} a(t_h)a(t_k)$$
$$= n^{-1} \sum_{j=1}^{n} a(t_j)^2 - \Big[n^{-1} \sum_{j=1}^{n} a(t_j)\Big]^2,$$

and this expression tends to $\int_0^1 a(t)^2\,dt - \big[\int_0^1 a(t)\,dt\big]^2$ as $n$ tends to infinity.
But

$$\int_0^1 a(t)^2\,dt - \Big[\int_0^1 a(t)\,dt\Big]^2 = \int_0^1 (a(t) - a)^2\,dt,$$

since $a = \int_0^1 a(t)\,dt$, and thus we have shown that $E(s^2)$ tends to
$M_2 + \int_0^1 (a(t) - a)^2\,dt$ as $n$ tends to infinity.

If $j, h, k, p, q, r$ are integers between 1 and $n$, we put

$$(j, h, k; p, q, r) = E\big((x_j - x_h)(x_j - x_k)(x_p - x_q)(x_p - x_r)\big) - E\big((x_j - x_h)(x_j - x_k)\big)\,E\big((x_p - x_q)(x_p - x_r)\big).$$

If neither $j$, $h$ nor $k$ is equal to any of the three integers $p, q, r$, it follows from the
independence of the variables that $(j, h, k; p, q, r) = 0$. Thus

$$E\big((s^2 - E(s^2))^2\big) = E(s^4) - E(s^2)^2 = n^{-6} \sum{}' (j, h, k; p, q, r),$$

where the summation is taken over all the values of $j, h, k, p, q, r$ between 1 and
$n$ with the restriction that at least one of the three numbers $j, h, k$ is equal to at
least one of the three numbers $p, q, r$. This sum contains therefore not more than
$3^2 n^5$ summands, and each of the summands is bounded, since they depend only on
$a(t_i)$, $M_2(t_i)$, $M_3(t_i)$ and $M_4(t_i)$. Thus $E\big((s^2 - E(s^2))^2\big)$ is equal to $n^{-6}$ times a
sum of not more than $3^2 n^5$ summands which are bounded. Hence $E\big((s^2 - E(s^2))^2\big)$
tends to 0 as $n$ tends to infinity. Now our theorem is an immediate
consequence of the criterion 2.d.

Noting that $\int_0^1 (a(t) - a)^2\,dt$ is nothing but the variance of the function
$a(t)$ (around its mean $a$), we obtain the following obvious consequence of
Theorems 1 and 2.
Corollary: $s^2 - d^2$ converges stochastically to the variance of $a(t)$.

Remarks similar to those made in connection with the proof of the theorem of
section 3 may be made now in regard to the theorems of this section.

By similar arguments it is possible to prove that the statistic $n^{-1} \sum_{i=1}^{n-1} x_i x_{i+1}$
converges stochastically to $\int_0^1 a(t)^2\,dt$.
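
The corollary lends itself to a quick simulation. The sketch below is an illustration only: it assumes normal observations with $a(t) = t$, $M_2(t) = 1$ and $t_j = j/n$ (all hypothetical choices), takes $s^2$ with denominator $n$ and $d^2 = (2n)^{-1}\sum_{j=1}^{n-1}(x_j - x_{j+1})^2$, and uses a fixed seed so that the run is reproducible.

```python
import random


def s2_and_d2(xs):
    """Return (s^2, d^2): the ordinary variance with denominator n and the
    mean-square-successive-difference estimate with denominator 2n."""
    n = len(xs)
    mean = sum(xs) / n
    s2 = sum((x - mean) ** 2 for x in xs) / n
    d2 = sum((xs[j] - xs[j + 1]) ** 2 for j in range(n - 1)) / (2 * n)
    return s2, d2


rng = random.Random(0)          # fixed seed, for reproducibility only
n = 50000
# Assumed model: x_j is normal with mean a(t_j) = j/n and variance M_2 = 1.
xs = [rng.gauss(j / n, 1.0) for j in range(1, n + 1)]

s2, d2 = s2_and_d2(xs)
trend_variance = s2 - d2        # estimates int (a(t) - a)^2 dt, which is 1/12 here
print(s2, d2, trend_variance)
```

In this setting $d^2$ should land near $M_2 = 1$ and $s^2 - d^2$ near the variance $1/12$ of $a(t) = t$ on the unit interval.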


5. The third moment. Put $d(3) = n^{-1} \sum_{j=1}^{n-2} (x_j - x_{j+1})^2 (x_{j+1} - x_{j+2})$. Then
$d(3)$ is a function of the random sample $O(T_n)$.

Theorem 1: $d(3)$ converges stochastically to $M_3$.

Proof: It is readily seen that 

E({Xj — a;y+i)*(xy4.i — Xy+j)) = Ms(<y+l) + (a(<y4.i) — a{tj+2)){M2(tj) 

+ (o(<>) - (a(</) - + MtUi+i)), 

and in practically the same fashion as in the proof of Theorem 1 of section 4 one
shows now that $E(d(3))$ tends to $M_3$ as $n$ tends to infinity.

Furthermore we have

$$E\big((d(3) - E(d(3)))^2\big) = E(d(3)^2) - E(d(3))^2 = n^{-2} \sum_{j,h} (j, h),$$

where

$$(j, h) = E\big((x_j - x_{j+1})^2 (x_{j+1} - x_{j+2})(x_h - x_{h+1})^2 (x_{h+1} - x_{h+2})\big)$$
$$- E\big((x_j - x_{j+1})^2 (x_{j+1} - x_{j+2})\big)\,E\big((x_h - x_{h+1})^2 (x_{h+1} - x_{h+2})\big).$$

Clearly $(j, h) = 0$ whenever $j + 2 < h$ or $h + 2 < j$. Consequently there
appear actually in the sum of all the $(j, h)$ not more than $5n$ terms each of which
is bounded by an absolute constant, since they depend only on $a(t_i)$, $M_2(t_i)$,
$M_3(t_i)$, $M_4(t_i)$, $M_5(t_i)$ and $M_6(t_i)$. From this fact we infer as before that
$E\big((d(3) - E(d(3)))^2\big)$ tends to 0, as $n$ tends to infinity, and our theorem is an
immediate consequence of the criterion 2.d.

Remark 1. If $M_2(t)$, $M_3(t)$ and $a(t)$ are constant, it follows from the proof that

$$\frac{n}{n-2}\,E(d(3)) = M_3,$$

and thus $(n - 2)^{-1} \sum_{j=1}^{n-2} (x_j - x_{j+1})^2 (x_{j+1} - x_{j+2})$ is an unbiased estimate of $M_3$.

Remark 2. One might be tempted to use instead of $d(3)$ the following function:

$$n^{-1} \sum_{j=1}^{n-1} (x_j - x_{j+1})^3.$$

By an argument of a nature rather similar to the one used in the preceding proof
one may show, however, that this statistic converges stochastically to 0.
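
Both remarks can be verified exactly on a small discrete example. The sketch below assumes a hypothetical skewed two-point distribution (value $-1$ with probability $2/3$, value $2$ with probability $1/3$) shared by all observations, and enumerates all outcomes to compute the expectations of one summand of $d(3)$ and of one summand of the cubed-difference statistic.

```python
from fractions import Fraction
from itertools import product

# Assumed (hypothetical) common distribution of the x_j: a skewed two-point law.
values = [Fraction(-1), Fraction(2)]
probs = [Fraction(2, 3), Fraction(1, 3)]


def expect(g, k):
    """Exact expectation of g(x_1, ..., x_k) over k independent draws."""
    total = Fraction(0)
    for idx in product(range(len(values)), repeat=k):
        p = Fraction(1)
        for i in idx:
            p *= probs[i]
        total += p * g(*(values[i] for i in idx))
    return total


mean = expect(lambda x: x, 1)
m3 = expect(lambda x: (x - mean) ** 3, 1)                      # third central moment M_3
d3_term = expect(lambda x, y, z: (x - y) ** 2 * (y - z), 3)    # summand of d(3)
cube_term = expect(lambda x, y: (x - y) ** 3, 2)               # summand of Remark 2
print(m3, d3_term, cube_term)
```

The summand of $d(3)$ has expectation $M_3$, whereas the cubed difference has expectation 0 by symmetry in the two observations, which is why the statistic of Remark 2 carries no information about $M_3$.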
Put $s(3) = n^{-1} \sum_{j=1}^{n} (x_j - \bar{x})^3$. Then $s(3)$ is a function of the random sample
$O(T_n)$. Furthermore let

$$F_3 = 3\Big(\int_0^1 a(t)M_2(t)\,dt - aM_2 - a\int_0^1 a^2(t)\,dt\Big) + 2a^3 + \int_0^1 a^3(t)\,dt.$$

Theorem 2. $s(3)$ converges stochastically to $M_3 + F_3$.

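Proof: For fixed $j$, let $X(j) = \sum_{h=1}^{n} \big(x_j - a(t_j) + a(t_h) - x_h\big)$ and
$A(j) = \sum_{h=1}^{n} (a(t_j) - a(t_h))$. Then

$$E(s(3)) = n^{-4} \sum_{j=1}^{n} E\big((X(j) + A(j))^3\big) = n^{-4} \sum_{j=1}^{n} \big[E(X(j)^3) + 3A(j)E(X(j)^2) + A(j)^3\big],$$

since $E(X(j))$ is easily seen to be 0. We find furthermore that

$$E(X(j)^3) = (n - 1)^3 M_3(t_j) + E\Big(\Big[\sum_{h \ne j} (a(t_h) - x_h)\Big]^3\Big) = ((n - 1)^3 + 1)M_3(t_j) - \sum_{h=1}^{n} M_3(t_h),$$

$$E(X(j)^2) = (n - 1)^2 M_2(t_j) + E\Big(\Big[\sum_{h \ne j} (a(t_h) - x_h)\Big]^2\Big) = ((n - 1)^2 - 1)M_2(t_j) + \sum_{h=1}^{n} M_2(t_h).$$

Consequently

$$E(s(3)) = n^{-4}\Big[((n - 1)^3 - n + 1) \sum_{j=1}^{n} M_3(t_j) + 3((n - 1)^2 - 1) \sum_{j=1}^{n} A(j)M_2(t_j) + 3 \sum_{j=1}^{n} A(j) \sum_{h=1}^{n} M_2(t_h) + \sum_{j=1}^{n} A(j)^3\Big].$$

Since furthermore

$$\sum_{j=1}^{n} A(j) = \sum_{j,h} (a(t_j) - a(t_h)) = 0,$$

$$\sum_{j=1}^{n} A(j)M_2(t_j) = n \sum_{j=1}^{n} a(t_j)M_2(t_j) - \sum_{h=1}^{n} a(t_h) \sum_{j=1}^{n} M_2(t_j),$$

$$\sum_{j=1}^{n} A(j)^3 = \sum_{j=1}^{n} \Big[n\,a(t_j) - \sum_{h=1}^{n} a(t_h)\Big]^3$$
$$= n^3 \sum_{j=1}^{n} a(t_j)^3 - 3n^2 \sum_{j=1}^{n} a(t_j)^2 \sum_{h=1}^{n} a(t_h) + 3n \sum_{j=1}^{n} a(t_j) \Big[\sum_{h=1}^{n} a(t_h)\Big]^2 - n\Big[\sum_{h=1}^{n} a(t_h)\Big]^3,$$

it is easily verified that $E(s(3))$ tends to $M_3 + F_3$, as $n$ tends to infinity.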

To prove that $E\big((s(3) - E(s(3)))^2\big)$ tends to 0 as $n$ tends to infinity, one
proceeds as in the proofs of the preceding theorems, namely by verifying that
this expectation is $n^{-8}$ times a sum of not more than $4^2 n^7$ summands which are
bounded, since they depend only on $a(t_i)$ and on the $M_m(t_i)$ for $1 < m < 7$.
The proof of the theorem may then be completed by applying the criterion 2.d.

It is readily seen that $F_3$ vanishes whenever $a(t)$ is constant. But from

$$F_3 = 3\Big[\int_0^1 a(t)M_2(t)\,dt - aM_2\Big] + \int_0^1 (a(t) - a)^3\,dt$$

we infer that $F_3$ vanishes too whenever $M_2(t)$ is constant and $a(t)$ is at the same
time symmetric with regard to $a$, and more precisely: if $M_2(t)$ is constant, a
necessary and sufficient condition for the vanishing of $F_3$ is the vanishing of the
third moment of the function $a(t)$ around its mean. Thus we see that $d(3)$
is always a consistent statistic for $M_3$, though $s(3)$ is not.

6. The fourth moment. The results in this section will be stated without 
proof. Their proofs can be constructed on exactly the same lines as the proofs 
in sections 4 and 5. 

n—1 n—1 

(2n)"‘ 23 i^s - 23 (%-i - Xjf{xj+i - Xj) 

7-1 7-2 


Fz * 3j^j^ dt — 


and 


n—i 

n~' 23 (*/-» ~ - xjf 

7-2 

converge stochastically to Ma + 3 / Mzitf dt. 

Jo 

(xj - Xj+if J converges stochastically to Ma + M\. 

n—2 

(4n)'’^ 23 (^ 7-1 “ ^ 7 )*(^ 7 +i ^*^ 7 + 2 )* converges stochastically to / Mzitf dt. 

,-2 Jo 

From these facts one easily deduces that Ma is the stochastic limit of 

«“T ^ £ (*y - ? S fe-i - 

L ^ 7-1 4 /-2 J 

and that J (Mzit) — ilf*)* dt is the stochastic limit of 

(2n)“‘|^2 (xj - xj+j)* - - £ (aty-i - Xif(xi+i - 


7. Efficiency. If $f = f(O(T_n))$ is a function of the random sample $O(T_n)$,
and if $f$ converges stochastically to a number $r$, then

$$\lim_{n\to\infty} nE\big((f - r)^2\big)$$

may be considered as some sort of a measure for the efficiency* of the statistic
$f$ as an estimate of $r$, provided, of course, the limit exists.

Theorem 1. If the function $a(t)$ is of bounded variation, then

$$\lim_{n\to\infty} nE\big((\bar{x} - a)^2\big) = M_2.$$

Proof: Clearly

$$nE\big((\bar{x} - a)^2\big) = n^{-1} E\Big(\Big[\sum_{j=1}^{n} (x_j - a)\Big]^2\Big) = n^{-1} \sum_{j=1}^{n} M_2(t_j) + n^{-1} \Big[\sum_{j=1}^{n} (a(t_j) - a)\Big]^2.$$

Now

$$\sum_{j=1}^{n} (a(t_j) - a) = \sum_{j=1}^{n} a(t_j) - na = \sum_{j=1}^{n} \Big[a(t_j) - n\int_{(j-1)/n}^{j/n} a(t)\,dt\Big].$$

Since $a(t)$ is a continuous function, there exists a number $u_j$ such that

$$(j - 1)/n \le u_j \le j/n, \quad\text{and}\quad \int_{(j-1)/n}^{j/n} a(t)\,dt = n^{-1} a(u_j).$$

Thus

$$\sum_{j=1}^{n} (a(t_j) - a) = \sum_{j=1}^{n} (a(t_j) - a(u_j)).$$

But both $t_j$ and $u_j$ are between $(j - 1)/n$ and $j/n$, and $a(t)$ is of bounded variation.
Hence there exists a constant $A$ which depends on $a(t)$ only and not on $n$
or $T_n$ such that

$$\Big[\sum_{j=1}^{n} (a(t_j) - a)\Big]^2 < A \quad\text{for every choice of } T_n.$$

The contention of our theorem is a fairly immediate consequence of these facts.

This theorem and its proof may serve as an additional substantiation of the 
remarks appended to section 3. 
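
The role of the bounded-variation hypothesis can be seen in a small computation. The sketch below evaluates the exact expression $nE((\bar{x} - a)^2) = n^{-1}\sum M_2(t_j) + n^{-1}[\sum (a(t_j) - a)]^2$ from the proof, under the hypothetical choices $a(t) = t$ (so $a = 1/2$), $M_2(t) = 1$ and $t_j = j/n$; the function name is illustrative.

```python
from fractions import Fraction


def scaled_mse_of_mean(a, m2, ts, a_bar):
    """n * E((x_bar - a_bar)^2), split into the variance part and the bias part."""
    n = len(ts)
    variance_part = sum(m2(t) for t in ts) / n          # n^{-1} * sum M_2(t_j)
    bias_sum = sum(a(t) - a_bar for t in ts)            # sum (a(t_j) - a), bounded in n
    return variance_part + bias_sum ** 2 / n


def right_end_points(n):
    return [Fraction(j, n) for j in range(1, n + 1)]    # assumed choice t_j = j/n


small = scaled_mse_of_mean(lambda t: t, lambda t: Fraction(1), right_end_points(100), Fraction(1, 2))
large = scaled_mse_of_mean(lambda t: t, lambda t: Fraction(1), right_end_points(10000), Fraction(1, 2))
print(small, large)
```

Here the bias sum equals $1/2$ for every $n$, so its squared contribution is divided by $n$ and disappears in the limit, leaving $M_2 = 1$.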

Remark: If we had assumed only the continuity of $a(t)$ instead of its being
of bounded variation, we could have tried to argue as follows: Since $a(t)$ is
continuous, there exists to every positive number $\varepsilon$ an integer $N(\varepsilon)$ such that
$|a(t') - a(t'')| < \varepsilon$ for $|t' - t''| < N(\varepsilon)^{-1}$. Hence we would find that for $N(\varepsilon) < n$
we have

$$n^{-1}\Big[\sum_{j=1}^{n} (a(t_j) - a)\Big]^2 < n\varepsilon^2;$$

and this inequality is certainly insufficient for proving that the left side of the
inequality tends to 0 as $n$ tends to infinity.

Theorem 2: If the functions $a(t)$ and $M_2(t)$ are both of bounded variation, then

$$\lim_{n\to\infty} nE\big((d^2 - M_2)^2\big) = M_4.$$


“Wilks [2], p. 184/136. 

“ or a measure for the asymptotic variance of the function /. 
Proof: In the course of the proof of Theorem 1 of section 4 we have shown
that $E(d^2) = (2n)^{-1}(A + B - C)$, where

$$A = 2 \sum_{j=1}^{n} M_2(t_j), \quad B = \sum_{j=1}^{n-1} (a(t_j) - a(t_{j+1}))^2, \quad C = M_2(t_1) + M_2(t_n).$$

Since $M_2(t)$ is bounded, it is clear that $n^{-1/2} C$ tends to 0 as $n$ tends to infinity.
Since $a(t)$ is of bounded variation, there exists a constant $B^*$ such that $B < B^*$
for every choice of $T_n$, and hence $n^{-1/2} B$ tends to 0 as $n$ tends to infinity.* Furthermore
we have

$$\sum_{j=1}^{n} M_2(t_j) - nM_2 = \sum_{j=1}^{n} \Big[M_2(t_j) - n\int_{(j-1)/n}^{j/n} M_2(t)\,dt\Big].$$

Because of the continuity of $M_2(t)$ there exist numbers $v_j$ such that

$$(j - 1)/n \le v_j \le j/n, \quad\text{and}\quad M_2(v_j) = n\int_{(j-1)/n}^{j/n} M_2(t)\,dt.$$

Consequently

$$\sum_{j=1}^{n} M_2(t_j) - nM_2 = \sum_{j=1}^{n} \big[M_2(t_j) - M_2(v_j)\big].$$

But $M_2(t)$ is a function of bounded variation, and thus we may infer, as in the
proof of Theorem 1, that $n^{1/2}[(2n)^{-1}A - M_2]$ tends to 0 as $n$ tends to infinity.
Combining all the facts we see that $n^{1/2}(E(d^2) - M_2)$ tends to 0 as $n$ tends to
infinity, and hence we have shown that $n[E(d^2) - M_2]^2$ tends to 0, as $n$ tends to
infinity.

As in the proof of Theorem 1 of section 4 we note next that 

Eid^) - E(iff = (2n)-X 


where (i,i) = E(ixi — Xi+ifixj — Xj+if) — E{{xi — Xi^xy)E{{Xj — Xj+i)"'), 
and that (t, j) — 0, if either f + 1 < j or j + 1 < i. Next we observe that 
(f, j) = E{{Xi — adi) + adi+i) — Xnifixj — a(<>) + ads+\) — Xj-x-if) 

- Edxi — add + o(<i+x) - Xi+iy)E((Xj - adj) + o(<>+i) - af,+i)“) 
+ (add - a(t4+i))(i,jy + (a(<,) - a(tj+i))(i, J)", 

where the expressions (t, j)' and (i, j)" are bounded (by a number independent 
of i, jy n or T). 

Consequently we have 

(t, i) = M4{ti) + QM2{ti)M2{ti^l) + M4(/*+i) — {M2{ti) -f“ M2{ti+l)Y 

+ (o(^•) - a{ti+i))iiy 0* 

= Midi) + + M2{Uf + M2dwf 

- 2(Mtdd - Mtdi+df + (add - a(U+x))(i, i)*, 
where (f, t)* ~ (t, i)' + (i, t)" is bounded by a bound independent of n, Tn . 

A remark similar to the one made just before stating Theorem 2 may be made here and 
below about the indispensability of the hypothesis that a{i) and Afs(0 be of bounded varia¬ 
tion. 


Likewise we find that 

(i, t + 1 ) “ MiiU) *(<<+*) 

{Mt{U) + M^M) + Mt(ti^)) 

+ {a{ti) - a(t,+i)) (t, i + 1 )' + (aiU+t) - a(<,+t)) (i, i + 1 )" 

+ (a(^) — a(ti+i)) (iyi + 1) + (a(<<+i) — a(i<+s))(t, t + 1) 

Hence 


(t, i) + 2(zj t + 1) = + (Miitd 


- - M,(«0) + (a(^) 

- a(<<+i)) (t, *■)■*■ + (a(<.+i) 

- o(<i+2)) (*, i + 1)", 


where (t, i)'*’ 
of i, n, T, 


= (i, iy + (t, iy' + (t, t + 1 )' is bounded by a bound independent 
Considering that 


n -2 


21 (hj) - 2!} (h 0 + 2 22 (^*» + i)> 


it is now deduced from the continuity of the functions $a(t)$, $M_2(t)$ and $M_4(t)$ that
$n[E(d^4) - E(d^2)^2]$ tends to $M_4$, as $n$ tends to infinity. We note finally that
$E\big((d^2 - M_2)^2\big) = E\big((d^2 - E(d^2))^2\big) + (E(d^2) - M_2)^2$, and the theorem is an
immediate consequence of the facts we have deduced.

Theorem 3. If the functions $a(t)$ and $M_2(t)$ are both of bounded variation, then

$$\lim_{n\to\infty} nE\Big(\Big(s^2 - M_2 - \int_0^1 (a(t) - a)^2\,dt\Big)^2\Big)$$
$$= M_4 - \int_0^1 M_2(t)^2\,dt + 4\int_0^1 (a(t)M_3(t) - aM_3)\,dt + 4\int_0^1 M_2(t)(a(t) - a)^2\,dt.$$

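Proof. Since $a(t)$ and $M_2(t)$ are of bounded variation, we show, as in the
proofs of the two preceding theorems, that

$$n^{1/2}\Big(n^{-1} \sum_{j=1}^{n} a(t_j) - a\Big), \quad n^{1/2}\Big(n^{-1} \sum_{j=1}^{n} a(t_j)^2 - \int_0^1 a(t)^2\,dt\Big), \quad\text{and}\quad n^{1/2}\Big(n^{-1} \sum_{j=1}^{n} M_2(t_j) - M_2\Big)$$

all tend to 0, as $n$ tends to infinity. In the proof of Theorem 2 of section 4 we
computed $E(s^2)$. Using this result we obtain:

$$n^{1/2}\Big(E(s^2) - M_2 - \int_0^1 (a(t) - a)^2\,dt\Big)$$
$$= n^{1/2}\Big(n^{-1} \sum_{j=1}^{n} M_2(t_j) - M_2\Big) + n^{1/2}\, n^{-2} \sum_{j=1}^{n} M_2(t_j)$$
$$+ n^{1/2}\Big(n^{-1} \sum_{j=1}^{n} a(t_j)^2 - \int_0^1 a(t)^2\,dt\Big) + n^{1/2}\Big(a^2 - \Big[n^{-1} \sum_{j=1}^{n} a(t_j)\Big]^2\Big),$$

where one should remember the identity $\int_0^1 (a(t) - a)^2\,dt = \int_0^1 a(t)^2\,dt - a^2$.
But

$$a^2 - \Big[n^{-1} \sum_{j=1}^{n} a(t_j)\Big]^2 = \Big(a - n^{-1} \sum_{j=1}^{n} a(t_j)\Big)\Big(a + n^{-1} \sum_{j=1}^{n} a(t_j)\Big),$$

where the last factor on the right is bounded by a bound independent of $n$ and
$T_n$. Hence it follows that

$$n^{1/2}\Big(E(s^2) - M_2 - \int_0^1 (a(t) - a)^2\,dt\Big) \text{ tends to 0, as } n \text{ tends to infinity.}$$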

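By a computation of great length and little interest one shows that

$$nE\big((s^2 - E(s^2))^2\big) = n^{-3}\Big[(n - 1)^2 \sum_{j=1}^{n} M_4(t_j) + 4n(n - 1) \sum_{j=1}^{n} M_3(t_j)a(t_j)$$
$$- 4(n - 1) \sum_{j=1}^{n} M_3(t_j) \sum_{h=1}^{n} a(t_h) + 2\Big[\sum_{j=1}^{n} M_2(t_j)\Big]^2$$
$$- (n^2 - 2n + 3) \sum_{j=1}^{n} M_2(t_j)^2 + 4n^2 \sum_{j=1}^{n} M_2(t_j)a(t_j)^2$$
$$- 8n \sum_{j=1}^{n} a(t_j) \sum_{h=1}^{n} a(t_h)M_2(t_h) + 4\Big[\sum_{j=1}^{n} a(t_j)\Big]^2 \sum_{h=1}^{n} M_2(t_h)\Big].$$

It is readily seen that this expression tends to

$$M_4 + 4\int_0^1 M_3(t)a(t)\,dt - 4M_3 a - \int_0^1 M_2(t)^2\,dt + 4\int_0^1 M_2(t)a(t)^2\,dt - 8a\int_0^1 a(t)M_2(t)\,dt + 4a^2 M_2,$$

and now it is clear how to complete the proof of our theorem.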

Corollary 1. If $a(t)$ is constant and $M_2(t)$ of bounded variation, then

$$\lim_{n\to\infty} nE\big((s^2 - M_2)^2\big) = M_4 - \int_0^1 M_2(t)^2\,dt.$$

This is an almost immediate consequence of Theorem 3, since $a(t) = a$, if
$a(t)$ is constant.

It has been shown in section 4 that $d^2$ is always a consistent estimate of $M_2$
whereas $s^2$ is a consistent estimate of $M_2$ if, and only if, $a(t)$ is constant.
Theorem 1 and Corollary 1 offer a basis for comparing the efficiency of these two
statistics. Since

$$0 < M_2(t)^2 < M_4(t) \quad\text{for every } t$$

(apart from trivial exceptions), we infer from Theorem 1 and Corollary 1 the
following fact.

Corollary 2. If $a(t)$ is constant and $M_2(t)$ of bounded variation, then

$$\lim_{n\to\infty} \frac{E\big((s^2 - M_2)^2\big)}{E\big((d^2 - M_2)^2\big)} = \frac{M_4 - \int_0^1 M_2(t)^2\,dt}{M_4},$$

and this expression is always positive and smaller than 1.

Thus we may say roughly that for large $n$ the estimate $s^2$ of $M_2$ is more efficient
than the estimate $d^2$, in case both may be used. We do, however, not offer
any information on the necessary size of $n$. Neither do we claim that for small
$n$ it might not happen that $d^2$ gives a good estimate and $s^2$ a poor one.
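
As a worked instance of Corollary 2, consider the classical case in which every $D(t)$ is the same normal distribution with unit variance (an assumed example, not taken from the paper). Then $M_4 = 3$ and $\int_0^1 M_2(t)^2\,dt = 1$, and the limiting ratio can be computed as follows; the function name is illustrative.

```python
from fractions import Fraction


def limiting_mse_ratio(m4_mean, m2_squared_mean):
    """Limit of E((s^2 - M_2)^2) / E((d^2 - M_2)^2) from Corollary 2:
    (M_4 - int M_2(t)^2 dt) / M_4."""
    return (m4_mean - m2_squared_mean) / m4_mean


# Assumed example: identical normal distributions with variance 1,
# so that M_4 = 3 and int M_2(t)^2 dt = 1.
ratio = limiting_mse_ratio(Fraction(3), Fraction(1))
print(ratio)
```

In this case the mean square error of $s^2$ is asymptotically two thirds of that of $d^2$, which quantifies the price paid for the robustness of $d^2$ against a trend in $a(t)$.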
TESTING THE HOMOGENEITY OF POISSON FREQUENCIES

By Paul G. Hoel 
University of California at Los Angeles 

1. Introduction. The standard procedure for testing the homogeneity of a 
set of k Poisson frequencies seems to be to apply the Poisson index of dispersion 
to those frequencies. The originators of this procedure [1] pointed out that this 
procedure may be regarded as a test of goodness of fit in which the Poisson 
frequencies constitute observed frequencies corresponding to k cells with equal 
expected values. Somewhat later it was shown [2] that the corresponding like¬ 
lihood ratio test was approximately equivalent to the index of dispersion test. 
Then the problem was approached from the viewpoint of conditional variation 
[3], [4]. This approach permitted exact tests to be studied in some detail for 
small samples. A few years later an exact test for the special case of k = 2
was introduced and studied [5]. In this investigation consideration was given for 
the first time to the efficiency of the proposed test. Tables of critical regions 
for the test and tables for computing the power of the test corresponding to 
certain alternatives were made available. 

In spite of the desirable features of this last test, it still possesses certain
drawbacks. First, this test, as well as the others referred to, did not consider the
problem in which the rate of occurrence of a rare event is constant but for which
the sampling units differ in size. For example, these methods were not designed
to enable one to test whether a factory's accident rate had remained unchanged
during the past month as compared with the preceding three months. Second,
in order to use this test it is necessary to possess the special tables or charts of
critical regions constructed for the test.

In this paper a method which does not require special tables is considered for
dealing with these more general situations. In the course of the development
it is shown that this method is, in a certain sense, the best method possible for
testing the hypothesis of homogeneity against one sided alternatives. Since this
paper is principally concerned with removing the undesirable features of the
method advocated in the last mentioned paper, it is advisable to read that paper
in conjunction with this one. The procedure to be followed here will be to derive
a uniformly most powerful test, show that it is equivalent to a $\chi^2$ test, and then
compare it with the previously mentioned test.


2. Similar regions. In the following two sections a study will be made of the
efficiency of a generalization of the critical region proposed in [5]. For this
purpose let $x$ and $y$ represent sample frequencies from two independent Poisson
distributions with means $m_x$ and $m_y$. The probability of obtaining this sample
is given by

(1) $$P(x, y) = \frac{e^{-m_x} m_x^{x}}{x!} \cdot \frac{e^{-m_y} m_y^{y}}{y!}.$$


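Following the notation and procedure given in [5], let

(2) $$\mu = m_x + m_y, \quad p = \frac{m_x}{m_x + m_y}, \quad n = x + y.$$

Then algebraic manipulation will show that $P(x, y)$ reduces to

(3) $$P(x, y) = \frac{e^{-\mu} \mu^{n}}{n!} \cdot \frac{n!}{x!\,(n - x)!}\, p^{x} (1 - p)^{n - x}.$$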
The hypothesis which it is desired to test is that

(4) $$\frac{m_y}{m_x} = r,$$

where $r$ has been specified. The value of $r$ will often be the ratio of the sizes of
the two populations under consideration or the ratio of the time units of the two
samples. In many situations the alternatives to (4) which are of interest will
be one-sided. For example, after a factory has instituted a safety campaign,
it would be of interest to see if the rate was unaffected as against the possibility
of the rate having decreased; hence the alternatives to (4) would be

(5) $$\frac{m_y}{m_x} < r.$$

In terms of the parameters introduced in (2), the hypothesis (4) and its
alternatives (5) become

(6) $$p = \frac{1}{1 + r}, \quad p > \frac{1}{1 + r}.$$

Consider the probability given by (3) in much the same manner as was done
in [5]. This probability depends upon two parameters, $\mu$ and $p$, only the latter
of which is specified by the hypothesis; consequently if critical regions
independent of $\mu$ are desired, it will be necessary to find similar regions [6] with respect
to $\mu$. Since $x$ and $y$ are discrete variables, it is not possible to find similar
regions of arbitrary size; consequently it will be necessary to introduce continuous
approximating functions if such regions are desired and if best critical regions
are to be found. Toward this end consider the expression for $P(x, y)$ in (3).
It states that the probability that $x$ and $y$ will take on specified values is the
Poisson probability that the sample point will fall on the line $x + y = n$,
multiplied by the binomial conditional probability that the point will have the specified
$x$ coordinate when the point is known to lie on this line. If $p$ and $n$ are not small,
this binomial function could be approximated well by means of a normal function.
Or, if desired, factorials could be replaced by corresponding gamma functions
and the necessary normalizing factor introduced. Regardless of what
continuous function is chosen, a region on each line $x + y = n$ ($n = 0, 1, 2, \dots$)
can be selected such that the conditional probability for this approximating
function is $\alpha$ that a point on that line will lie in that region. Most natural
approximating functions would become trivial for $n = 0$; therefore it may be
necessary to choose an artificial function for this case or to adopt a convention
of letting the origin be the critical region for this case but accepting only $100\alpha$
percent of samples for which $n = 0$ as belonging to this critical region. The
totality of such $\alpha$ regions will constitute a critical region of size $\alpha$ which is
independent of $\mu$ because from (3) the probability of a point lying in this critical region
would now be given by

$$\sum_{n=0}^{\infty} \frac{e^{-\mu} \mu^{n}}{n!}\,\alpha = \alpha \sum_{n=0}^{\infty} \frac{e^{-\mu} \mu^{n}}{n!} = \alpha.$$

Thus, similar regions with respect to $\mu$ of size $\alpha$ can be obtained by selecting
regions of size $\alpha$ on each line $x + y = n$.

The preceding method for obtaining similar regions is the only method for
doing so if such regions are restricted to be found on the lines $x + y = n$, because
if a region of size $\alpha_n$ were selected on each line $x + y = n$, it would be necessary that

$$\sum_{n=0}^{\infty} \frac{e^{-\mu} \mu^{n}}{n!}\,\alpha_n = \alpha$$

independent of $\mu$. This is equivalent to requiring that

$$\sum_{n=0}^{\infty} \frac{\mu^{n}}{n!}\,\frac{\alpha_n}{\alpha} = e^{\mu},$$

but since the power series for $e^{\mu}$ is unique, it follows that $\alpha_n = \alpha$.


3. Common best critical region. Among these similar regions there will exist
a best critical region for testing the hypothesis $p = p_0$ against the single
alternative $p = p_1$ if there exist best critical regions on each line $x + y = n$. From (6)
it will be observed that this formulation is equivalent to testing the hypothesis
$r = r_0$ against the single alternative $r = r_1$. The best critical region [6] on such
a line, if it exists, will be that region which satisfies the inequality

(7) $$\frac{f(x;\, p_0)}{f(x;\, p_1)} \le k,$$

where $f$ denotes the continuous function selected to approximate the binomial
distribution on this line and $k$ is a constant determined so that the probability,
under the hypothesis $p = p_0$, will be $\alpha$ that a point on this line will lie in this
region. If the normal approximating function with $m = np$ and $\sigma^2 = npq$ is
used, (7) becomes

(8) $$\left[\frac{(x - np_1)^2}{np_1 q_1} - \frac{(x - np_0)^2}{np_0 q_0}\right] \le k.$$

After completing the square in $x$, it will be found that this inequality reduces to

(9) $$\left(\frac{1}{p_1 q_1} - \frac{1}{p_0 q_0}\right)\left[x - \frac{n(1/q_1 - 1/q_0)}{1/p_1 q_1 - 1/p_0 q_0}\right]^2 \le c,$$

where $c$ is independent of $x$.
If $x_0$ is a value of $x$ such that

(10) $$P[x > x_0 \mid p = p_0] = \alpha,$$

then (9) will hold for $x > x_0$ provided that $p_1 > p_0$. To demonstrate this fact,
it is convenient to consider the three cases $p_0 + p_1 >, <, = 1$ separately. If
$p_0 + p_1 > 1$,

$$\frac{1}{q_1} - \frac{1}{q_0} > 0, \quad \frac{1}{p_1 q_1} - \frac{1}{p_0 q_0} > 0, \quad \frac{1}{q_1} - \frac{1}{q_0} > \frac{1}{p_1 q_1} - \frac{1}{p_0 q_0},$$

and therefore

$$x \le n < n\left(\frac{1}{q_1} - \frac{1}{q_0}\right) \Big/ \left(\frac{1}{p_1 q_1} - \frac{1}{p_0 q_0}\right).$$

Since the coefficient of the brackets in (9) which involves $x$ is positive, increasing $x$
will reduce the left side of (9). If $p_0 + p_1 < 1$,

$$\frac{1}{p_1 q_1} - \frac{1}{p_0 q_0} < 0 \quad\text{and}\quad \frac{n(1/q_1 - 1/q_0)}{1/p_1 q_1 - 1/p_0 q_0} < 0.$$

Since the coefficient is now negative, increasing $x$ will reduce the left side of (9).
Finally, if $p_0 + p_1 = 1$, (9) will reduce to

$$\left(\frac{1}{p_1} - \frac{1}{p_0}\right)\left[x - \frac{n}{2}\right] \le k.$$

Since $1/p_1 - 1/p_0 < 0$, increasing $x$ will decrease the left side of this inequality.
It therefore follows that the region defined by (10) is a best critical region for
every alternative of the form $p_1 > p_0$ on the line $x + y = n$. The totality of
such regions for $n > 0$, together with the previously mentioned convention for
$n = 0$, then constitutes a common best critical region among all possible similar
regions for testing the hypothesis (4) against the set of alternatives (5).

In a similar manner it will be found that if the inequality in (10) is reversed,
the critical region so defined, together with the convention for $n = 0$, will constitute a
common best critical region for every alternative of the form $p_1 < p_0$. If the
alternative hypotheses consist of $p \ne p_0$, there will not exist a common best
critical region using these approximating functions.

The critical region proposed in [5] is that for the special hypothesis $p_0 = 1/2$ and
the set of alternatives $p \ne p_0$. It will be found that the lower half of this critical
region for $P = 2\alpha$ will differ little, except for very small samples, from that given
by (10) for this special case; however, it possesses the disadvantage of being
numerical and therefore of requiring a special table. The critical region given
by (10) does not possess this disadvantage. This fact will be demonstrated in
the next section.


4. Chi-square test. Consider the problem of testing compatibility between
observed and expected frequencies in two cells. Let $x$ and $y$ represent the
observed frequencies and $e_1$ and $e_2$ the expected frequencies in a sample of size $n$.
If the probability that an observation will fall in the first cell is, as in (6),
$p = 1/(1 + r)$, then

$$e_1 = np = \frac{x + y}{1 + r} \quad\text{and}\quad e_2 = nq = \frac{r(x + y)}{1 + r}.$$

The chi-square function for testing compatibility then reduces to

(11) $$\chi^2 = \frac{(x - e_1)^2}{e_1} + \frac{(y - e_2)^2}{e_2} = \frac{(y - rx)^2}{r(y + x)}.$$

Let $\chi_0$ be the value of $\chi$ such that $P[\chi^2 > \chi_0^2] = 2\alpha$ for one degree of freedom.
With $\chi$ replaced by $\chi_0$ in (11), this equation determines a parabola in the $x, y$
plane. If $x + y = n$ is not small, the probability of a point on the line $x + y = n$
lying outside of this parabola will be approximately $2\alpha$, the accuracy depending
on the accuracy of the $\chi^2$ approximation, and hence the probability of a point
lying outside of and below this parabola will be approximately $\alpha$. Thus, a critical
region for testing $p = p_0$ against $p > p_0$ will be given by that part of the positive
$x, y$ plane which lies below this parabola. In Figure 1 the lower half of this
parabola for the special case of $p_0 = 1/2$ is indicated by the symbol $\chi^2$. The critical
region for the alternatives $p < p_0$ would be the region lying above the upper half
of this same parabola, while the critical region for the alternatives $p \ne p_0$ would
consist of both of these regions at the $2\alpha$ level. For one degree of freedom, $\chi$
has a standard normal distribution; consequently the critical region given by
(11) is the same as that given by (10) in which a normal approximation is used
on each line $x + y = n$. This equivalence is easily verified by replacing $y$ by
$n - x$ and $r$ by $q/p$ in (11).
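
The test of this section can be sketched in a few lines. In the code below the function names and the sample counts are illustrative assumptions; the one-sided probability uses the standard normal tail written with `math.erfc`, in keeping with the statement that $\chi$ is a standard normal variable for one degree of freedom.

```python
import math


def chi_square_from_expected(x, y, r):
    """Chi-square computed from the expected frequencies e_1 = np, e_2 = nq, p = 1/(1+r)."""
    n = x + y
    e1 = n / (1 + r)
    e2 = r * n / (1 + r)
    return (x - e1) ** 2 / e1 + (y - e2) ** 2 / e2


def chi_square_reduced(x, y, r):
    """Reduced form (11): (y - r x)^2 / (r (x + y))."""
    return (y - r * x) ** 2 / (r * (x + y))


def one_sided_p_value(x, y, r):
    """Approximate probability, under m_y/m_x = r, of a result at least as far
    in the direction m_y/m_x < r (that is, p > 1/(1+r)), using the signed root of (11)."""
    chi = (r * x - y) / math.sqrt(r * (x + y))
    return 0.5 * math.erfc(chi / math.sqrt(2.0))


# Hypothetical counts: x events in one unit of exposure, y events in r = 1 unit.
print(chi_square_reduced(20, 10, 1.0), one_sided_p_value(20, 10, 1.0))
# Counts exactly in the ratio r = 3 (one month against the preceding three months).
print(chi_square_reduced(10, 30, 3.0), one_sided_p_value(10, 30, 3.0))
```

No special table is needed: any table or routine for the normal or chi-square distribution suffices, which is the practical point of the section.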


5. Likelihood ratio test. The chi-square test of the preceding section yields 
a common best critical r^on for testing (4) against (6) for the normal approxi¬ 
mation. It is interesting to compare this critical region with that obtained by 
the maximum likelihood principle, which requires no such approximations. 
Consider, therefore, the two dimensional parameter space 

fil Wl>45 ^ 0, 7Hy ^ 0, 

and the subspace 


(a: 


my 

mg 


r. 


Maximizing P in (1) over Q yields nix — x and roy ^ y. Maximizing P over w, 
treating P as a function of m* , yields m, = x + y/l + r. Then the maximum 
likelihood ratio becomes 


max Pw 
max Po 





xlyl 


This reduces to 


( 12 ) 





For a fixed value of $\lambda$, this equation determines a curve in the $x, y$
plane which may be used to determine a critical region. Since $-2\log\lambda$ is
known to possess an asymptotic chi-square distribution under certain conditions
[7], choose as critical region that part of the positive $x, y$ plane lying below
the curve determined by (12) when $\lambda$ has been replaced by $\lambda_0$, where
$\lambda_0$ is determined from $-2\log\lambda_0 = \chi_0^2$. This curve may be
plotted by reducing it to the parametric form

$$x = \frac{\log\lambda_0}{(1+v)\log\dfrac{1+v}{1+r} + v\log\dfrac{r}{v}},
\qquad y = vx.$$
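The comparison between the two statistics can be tried numerically. The sketch below is only an illustration: it assumes that the chi-square statistic of (11) is the usual Pearson statistic for the two counts under $m_y/m_x = r$ (formula (11) itself is not reproduced in this section), and the function names are hypothetical.

```python
import math

def expected_counts(x, y, r=1.0):
    """Fitted Poisson means under the hypothesis m_y / m_x = r."""
    n = x + y
    return n / (1 + r), r * n / (1 + r)

def chi_square(x, y, r=1.0):
    """Pearson chi-square with one degree of freedom (assumed form of (11))."""
    ex, ey = expected_counts(x, y, r)
    return (x - ex) ** 2 / ex + (y - ey) ** 2 / ey

def neg2_log_lambda(x, y, r=1.0):
    """-2 log(lambda), with lambda the likelihood ratio of (12)."""
    ex, ey = expected_counts(x, y, r)

    def term(obs, exp):
        # obs * log(obs / exp), with the convention 0 * log 0 = 0
        return 0.0 if obs == 0 else obs * math.log(obs / exp)

    return 2 * (term(x, ex) + term(y, ey))

def curve_point(v, r, chi0_sq):
    """Point (x, y = v*x) on the curve -2 log(lambda) = chi0_sq, for v != r."""
    log_lambda0 = -chi0_sq / 2
    denom = (1 + v) * math.log((1 + v) / (1 + r)) + v * math.log(r / v)
    x = log_lambda0 / denom
    return x, v * x
```

For counts such as x = 10, y = 20 with r = 1 the two statistics are close to one another, which is the behaviour the comparison of critical regions describes.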


A comparison of the critical regions corresponding to (11), (12), and a slight
modification of [5] for the special case of $p_0 = \tfrac12$ and $\alpha = .05$
is given in the accompanying sketch. The modification of [5] consists in choosing
$x_0$ to be that integer which most nearly satisfies (10), rather than to be the
smallest integer for which the left side of (10) does not exceed $\alpha$. The
latter method of choosing $x_0$ has a tendency to make the first type of error
considerably smaller than $\alpha$ for small values of $n$. It will be observed
that there are no appreciable differences between the maximum likelihood and
chi-square critical regions. Furthermore, it will be found that there are only
two values of $n$, namely $n = 3$ and $n = 9$, for $n < 30$ for which the
chi-square test and the modification of [5] might yield different decisions at
this significance level.

The preceding sections show that the chi-square test is highly satisfactory for 
testing the homogeneity of two Poisson frequencies, except possibly for very 
small frequencies, and that therefore special numerical tables are not necessary. 


6. Several Poisson frequencies. The generalization of (11) for a set of $k$
frequencies is, of course, the ordinary chi-square function

$$(13)\qquad \chi^2 = \sum_{i=1}^{k} \frac{(x_i - np_i)^2}{np_i},$$

where $n = \sum_{i=1}^{k} x_i$, $p_i$ is proportional to the sampling unit from
which $x_i$ was obtained, and $\sum_{i=1}^{k} p_i = 1$. The Poisson index of
dispersion is merely a special case of (13) when $p_i = 1/k$. The adequacy of
(13) for this special case has been studied elsewhere [3], [8], while studies of
(13) in general are numerous and well known.
SOME COMBINATORIAL FORMULAS ON MATHEMATICAL EXPECTATION

By L. C. Hsu

National Southwest Associated University, Kunming, China

The main problem considered here may be stated as follows:

Let $f_1(x), \cdots, f_n(x)$ be $n$ polynomials. It is the purpose of this paper
to establish formulas concerning the mathematical expectation (probable value)
of the product

$$f_1(x_1) \cdots f_n(x_n),$$

where $x_1, \cdots, x_n$ are positive random variables and the sum of these is
supposed known.

Before establishing the formulas let us introduce some notations for
convenience.

1. Notation. (A) In this paper the notation $(m; k; x_1, \cdots, x_n)$ or
$(m; k; x)$ is used to denote that a set of numbers $(x_1, \cdots, x_n)$ runs
over all different compositions of $m$ into $n$ parts with each $x \ge k$, i.e.
over all different integer solutions of the equation $x_1 + \cdots + x_n = m$
with each $x \ge k$.

(B) Let $m, \delta$ be two positive real numbers. The notation
$E(m, \delta, [f_1] \cdots [f_n])$ denotes the mathematical expectation of the
product $f_1(x_1) \cdots f_n(x_n)$ in which the sum $m = x_1 + \cdots + x_n$ is
known and for every $x_\nu$ ($\nu = 1, \cdots, n$) the value of $x_\nu/\delta$ is
a positive integer. The notation $E(m, \delta, [f_1] \cdots [f_n])$ thus implies
that the value of $m$ is a multiple of $\delta$. We call $\delta$ a "varying
unit", i.e. the least possible difference between two different quantities $x_i$
and $x_j$, $i \ne j$. The notation $E(m, \delta, [f]^n)$ is merely a special case
that denotes the mathematical expectation of the product
$f_1(x_1) \cdots f_n(x_n)$ under the known conditions

$$f_1 = \cdots = f_n = f,\qquad x_1 + \cdots + x_n = m,\qquad
\left[\frac{x_\nu}{\delta}\right] = \frac{x_\nu}{\delta}\quad (\nu = 1, \cdots, n),$$

where $[\ ]$ represents "integral part of".

(C) In order to simplify our formulas we always denote $f(x)$ by $f^{(x)}$,
$f_{\nu_1} + \cdots + f_{\nu_k}$ by $f_{\nu_1 \cdots \nu_k}$, and
$1 \cdot p_1 + \cdots + k \cdot p_k$ by $\sigma(p)$ or $\sigma$. It is a
convention that $\binom{m}{n} = 0$ for $m < n$.


2. Lemmas. Lemma 1. Let $m, n, r_1, \cdots, r_n$ be non-negative integers. Then

$$(1)\qquad \sum_{(m;0;x)} \binom{x_1}{r_1} \cdots \binom{x_n}{r_n}
 = \binom{m + n - 1}{r_1 + \cdots + r_n + n - 1}.$$





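Proof: The lemma follows immediately by considering the coefficient of the
term $t^m$ on both sides of

$$\prod_{\nu=1}^{n} \frac{t^{r_\nu}}{(1-t)^{r_\nu+1}}
 = \frac{t^{\,r_1 + \cdots + r_n}}{(1-t)^{\,r_1 + \cdots + r_n + n}}.$$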
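Lemma 1 can be checked by brute force for small values. The sketch below reads the lemma as a sum over all compositions of $m$ into $n$ non-negative parts; that reading, and the helper names, are assumptions made for the illustration.

```python
from math import comb, prod

def compositions(m, n, k=0):
    """Yield every n-tuple of integers >= k whose sum is m."""
    if n == 0:
        if m == 0:
            yield ()
        return
    # the remaining n-1 parts need at least k each
    for first in range(k, m - k * (n - 1) + 1):
        for rest in compositions(m - first, n - 1, k):
            yield (first,) + rest

def lemma1_lhs(m, r):
    """Sum of C(x_1, r_1) ... C(x_n, r_n) over non-negative compositions of m."""
    return sum(
        prod(comb(x, ri) for x, ri in zip(xs, r))
        for xs in compositions(m, len(r))
    )

def lemma1_rhs(m, r):
    """The single binomial coefficient on the right-hand side."""
    n = len(r)
    return comb(m + n - 1, sum(r) + n - 1)
```

With `k=1` the same generator lists the compositions into positive parts, whose number is $\binom{m-1}{n-1}$.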

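Lemma 2. Let $a, b, c, \cdots$ be any constants, and $k_1, k_2, k_3, \cdots$ any
integers. Then

$$(2)\qquad \sum_{(m;0;x)} \prod_{\nu=1}^{n}
 \left[a\binom{x_\nu}{k_1} + b\binom{x_\nu}{k_2} + c\binom{x_\nu}{k_3} + \cdots\right]
 = n! \sum_{\alpha+\beta+\gamma+\cdots=n}
 \binom{m+n-1}{\alpha k_1 + \beta k_2 + \gamma k_3 + \cdots + n - 1}
 \frac{a^{\alpha}}{\alpha!}\,\frac{b^{\beta}}{\beta!}\,\frac{c^{\gamma}}{\gamma!}\cdots.$$

Proof: Expanding the left-hand side of (2) we see that the coefficient of the
term $a^{\alpha} b^{\beta} c^{\gamma} \cdots$ is equal to

$$\frac{n!}{\alpha!\,\beta!\,\gamma!\cdots}
 \sum_{(m;0;x)} \binom{x_1}{k_1} \cdots \binom{x_\alpha}{k_1}
 \binom{x_{\alpha+1}}{k_2} \cdots .$$

By Lemma 1 it becomes

$$\frac{n!}{\alpha!\,\beta!\,\gamma!\cdots}
 \binom{m+n-1}{\alpha k_1 + \beta k_2 + \gamma k_3 + \cdots + n - 1}.$$

Hence the lemma.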

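Lemma 3. Let $m, n\ (\le m)$ be two positive integers. Then, for any given
polynomial $f(x)$ of the $k$th degree, we have

$$(3)\qquad \sum_{(m;1;x)} f(x_1) \cdots f(x_n)
 = n! \sum_{(n;0;p)} \binom{m+n-1}{\sigma+n-1}
 \prod_{\nu=0}^{k} \frac{\left[(f-1)^{(\nu)}\right]^{p_\nu}}{p_\nu!},$$

where $f^{(x)} = f(x)$, $\sigma = \sigma(p) = 1 \cdot p_1 + \cdots + k p_k$.

Proof: Since $f(x)$ is a polynomial of the $k$th degree, there exist $(k + 1)$
values $\beta_k, \cdots, \beta_0$ such that

$$\beta_k\binom{x}{k} + \cdots + \beta_1\binom{x}{1} + \beta_0 = f(x).$$

By putting $x = 0, 1, \cdots, k$, it is determined in order that

$$\beta_\nu = f^{(\nu)} - \binom{\nu}{1} f^{(\nu-1)} + \cdots + (-1)^{\nu}\binom{\nu}{\nu} f^{(0)}
 = (f-1)^{(\nu)},\qquad (\nu = 0, 1, \cdots, k).$$

The lemma is thus obtained by (2).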
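The coefficients $\beta_\nu = (f-1)^{(\nu)}$ used in this proof are the forward differences of $f$ at 0. A minimal sketch, with hypothetical function names, that computes them from the symbolic expansion and rebuilds $f$ from its binomial form:

```python
from math import comb

def newton_coefficients(f, k):
    """beta_v = sum_j (-1)^j C(v, j) f(v - j), i.e. the symbolic power (f - 1)^(v)."""
    return [
        sum((-1) ** j * comb(v, j) * f(v - j) for j in range(v + 1))
        for v in range(k + 1)
    ]

def eval_binomial_form(beta, x):
    """Evaluate beta_0 + beta_1 C(x, 1) + ... + beta_k C(x, k) at an integer x >= 0."""
    return sum(b * comb(x, v) for v, b in enumerate(beta))

def cubic(x):
    # example polynomial of degree 3, chosen only for illustration
    return x ** 3 - 2 * x + 5
```

For the cubic above the coefficients come out as 5, -1, 6, 6, and the binomial form agrees with the polynomial at every non-negative integer.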

For convenience we denote the summation
$\sum_{(m;1;x)} f_1(x_1) \cdots f_n(x_n)$ by $S(m, [f_1] \cdots [f_n])$. Thus the
formula (3) can be written as

$$S(m, [f]^n) = n! \sum_{(n;0;p)} \binom{m+n-1}{\sigma+n-1}
 \prod_{\nu=0}^{k} \frac{\left[(f-1)^{(\nu)}\right]^{p_\nu}}{p_\nu!}.$$
Lemma 4. Let $f_1(x), \cdots, f_n(x)$ be $n$ given polynomials. Then

$$(4)\qquad S(m, [f_1] \cdots [f_n]) = \frac{1}{n!} \sum_{1 \le k \le n}\
 \sum_{(\nu_1 \cdots \nu_k)} (-1)^{n-k}\, S(m, [f_{\nu_1} + \cdots + f_{\nu_k}]^n),$$

where $(\nu_1 \cdots \nu_k)$ runs over all different combinations out of
$(1 \cdots n)$, $k = 1, \cdots, n$.

Proof: The proof depends essentially on the formal logic theorem. Considering a
typical term

$$\frac{n!}{q_1! \cdots q_t!}\, S(m, [f_{\nu_1}]^{q_1} \cdots [f_{\nu_t}]^{q_t}),
\qquad 1 \le t \le n,\quad q_1 + \cdots + q_t = n,$$

we see that it is contained in the last $(n - t + 1)$ summations of the
right-hand side of (4), i.e. in the summations over $(\nu_1 \cdots \nu_k)$ as
$k = t, t + 1, \cdots, n$. The number of occurrences of the term in the
right-hand side of (4) is therefore

$$\sum_{k=t}^{n} (-1)^{n-k} \binom{n-t}{k-t} =
\begin{cases} 0 & \text{if } t < n, \\ 1 & \text{if } t = n. \end{cases}$$

The term vanishes generally except when $q_1 = \cdots = q_n = 1$. Hence the
right-hand side gives

$$S(m, [f_1] \cdots [f_n]).$$
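The sign pattern of (4) is the same as in the elementary identity that recovers a product of $n$ numbers from $n$th powers of partial sums. The sketch below checks that numerical analogue only; it is an illustration with a hypothetical function name, not the lemma itself.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def product_by_polarization(values):
    """Recover a_1 * ... * a_n from n-th powers of sums over non-empty subsets."""
    n = len(values)
    total = 0
    for k in range(1, n + 1):
        for subset in combinations(values, k):
            # subsets of size k enter with sign (-1)^(n - k), as in (4)
            total += (-1) ** (n - k) * sum(subset) ** n
    return Fraction(total, factorial(n))
```

In the lemma the role of the $n$th power is played by $S(m, [g]^n)$, which is symmetric and multilinear in the $n$ polynomial slots.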



3. Theorems with formulas. In the following statements of theorems and
corollaries, the notation $(x_1 \cdots x_n)$ is always to denote a set of
undetermined quantities, though the kind of the quantities of the set is stated.

Theorem 1. Let $(x_1 \cdots x_n)$ be a set of natural numbers under a known
condition $x_1 + \cdots + x_n = m$. Then, for any given polynomial $f(x)$ of the
$k$th degree, we have

$$(5)\qquad E(m, 1, [f]^n) = \frac{n!}{\binom{m-1}{n-1}}
 \sum_{(n;0;p)} \binom{m+n-1}{\sigma+n-1}
 \prod_{\nu=0}^{k} \frac{\left[(f-1)^{(\nu)}\right]^{p_\nu}}{p_\nu!}.$$

Proof: Let $m' = m + nr$. By Lemma 1 we then have

$$\sum_{(m';r;x)} \binom{x_1}{0} \cdots \binom{x_n}{0}
 = \binom{m' - nr + n - 1}{n - 1}.$$

This is the number of compositions of $m'$ into $n$ parts with each part
$\ge r$. In particular, for $r = 1$ we see that the number of compositions of
$m$ into $n$ parts is $\binom{m-1}{n-1}$. Thus by the definition of mathematical
expectation, the required value is equal to

$$S(m, [f]^n) \Big/ \binom{m-1}{n-1}.$$

The theorem is therefore proved by Lemma 3.
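The definition used in this proof, namely the sum of the products divided by the number of compositions, can be evaluated directly for small cases. The sketch below does only that; it is a brute-force illustration with a hypothetical function name and does not implement formula (5).

```python
from fractions import Fraction
from itertools import product
from math import comb, prod

def expectation(m, n, f):
    """Mean of f(x_1)...f(x_n) over compositions of m into n positive parts (m >= n)."""
    comps = [xs for xs in product(range(1, m + 1), repeat=n) if sum(xs) == m]
    # the number of such compositions is C(m - 1, n - 1)
    assert len(comps) == comb(m - 1, n - 1)
    total = sum(prod(f(x) for x in xs) for xs in comps)
    return Fraction(total, len(comps))
```

For example, with m = 4, n = 2 and f(x) = x the compositions are (1,3), (2,2), (3,1), and the mean product is 10/3.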
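Corollary 1. Let $(x_1 \cdots x_n)$ be a set of positive quantities, of which
the varying unit is $\delta$ and the sum is $m$. Then, for any given polynomial
$f(x)$ of the $k$th degree, we have

$$(6)\qquad E(m, \delta, [f]^n) = \frac{n!}{\binom{m/\delta - 1}{n-1}}
 \sum_{(n;0;p)} \binom{m/\delta + n - 1}{\sigma+n-1}
 \prod_{\nu=0}^{k} \frac{\left[(g-1)^{(\nu)}\right]^{p_\nu}}{p_\nu!},$$

where

$$g(x) = f(\delta x),\qquad \sigma = 1 \cdot p_1 + \cdots + k p_k.$$

Proof: It is deduced by the relation
$E(m, \delta, [f(x)]^n) = E(m/\delta, 1, [f(\delta x)]^n)$.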
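Corollary 2. Let $(x_1 \cdots x_n)$ be a set of non-negative real numbers under
a known condition $x_1 + \cdots + x_n = m$. Then, for any given polynomial
$f(x) = a_0 + \cdots + a_k x^k$, we have

$$(7)\qquad E(m, 0, [f]^n) = n!\,(n-1)! \sum_{(n;0;q)}
 \frac{m^{\sigma}}{(\sigma + n - 1)!}\,
 \frac{(0!\,a_0)^{q_0}}{q_0!} \cdots \frac{(k!\,a_k)^{q_k}}{q_k!},$$

where

$$a_k \ne 0,\qquad \sigma = \sigma(q) = 1 \cdot q_1 + \cdots + k q_k.$$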


Proof: The proof of the corollary depends essentially on the concept that two
different real numbers may differ by an arbitrarily small number $h$.

Let $h$ be an arbitrary positive number and let $f(xh) = h^k g(x, h)$, where the
number $k$ is the degree of $f(x)$. Then, since


n 


y-O 



(n - .)” 


we may VTite 



jf p > n 

if p = n 

if p = n + 1, 


g (-1)* g(.p - s, A) = rt!o, + h-R,{h)], 

where lim R„(h) 



J'lOr+i . 


Now we pass to the limit $h \to 0$, in which it is assumed that $h$ runs through
a sequence of rational numbers of the form $1/N$. Thus by Corollary 1 we have


lim Eim, h, [/]^) 


s rTT-r-mn 

(n; 0 ;p) (o -r n — 1)! F-0 


jpla^r 

Vyl 


Hence the corollary. 

It may be noted that this corollary can also be independently deduced by the
proportion of the two integrals:

$$\int \cdots \int f(x_1) \cdots f(x_n)\, dx_1 \cdots dx_{n-1}\ :\
 \int \cdots \int dx_1 \cdots dx_{n-1},$$

where the integrals are all taken over the region
$R$: $x_1 + \cdots + x_n = m$, $x_1 > 0, \cdots, x_n > 0$.

Corollary 3. Let $(x_1 \cdots x_n)$ be a set of positive real numbers under a
known condition $a \le x_1 + \cdots + x_n \le b$, where $a, b$ are non-negative
numbers. Then, for any given polynomial $f(x) = a_0 + \cdots + a_k x^k$
$(a_k \ne 0)$, the mathematical expectation of the product
$f(x_1) \cdots f(x_n)$, which we denote by $E((a, b), 0, [f]^n)$, is given by the
formula

$$(8)\qquad E((a, b), 0, [f]^n) = \frac{n!\,(n-1)!}{b - a} \sum_{(n;0;q)}
 \frac{b^{1+\sigma(q)} - a^{1+\sigma(q)}}{(1 + \sigma(q)) \cdot (n - 1 + \sigma(q))!}\,
 \frac{(0!\,a_0)^{q_0}}{q_0!} \cdots \frac{(k!\,a_k)^{q_k}}{q_k!}.$$

Proof: Since the required mathematical expectation is the mean

$$\frac{1}{b - a} \int_a^b E(u, 0, [f]^n)\, du,$$

Corollary 3 follows from Corollary 2.

On the other hand we see that

$$\lim_{h \to 0} E((a, a + h), 0, [f]^n) = E(a, 0, [f]^n).$$

Hence Corollary 2 can also be deduced from Corollary 3.

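Theorem 2. (First generalization of Theorem 1). Let $f_1(x), \cdots, f_n(x)$ be
$n$ given polynomials, of which the highest degree is $k$. Then we have

$$(9)\qquad E(m, 1, [f_1] \cdots [f_n]) = \frac{1}{\binom{m-1}{n-1}}
 \sum_{(\nu_1 \cdots \nu_s)}\ \sum_{(n;0;p)} (-1)^{n-s} \binom{m+n-1}{\sigma+n-1}
 \prod_{\nu=0}^{k} \frac{\left[(f_{\nu_1 \cdots \nu_s} - 1)^{(\nu)}\right]^{p_\nu}}{p_\nu!},$$

where $f_{\nu_1 \cdots \nu_s} = f_{\nu_1} + \cdots + f_{\nu_s}$.

Proof: In the proof of Theorem 1 we have seen that

$$E(m, 1, [f]^n) = \binom{m-1}{n-1}^{-1} S(m, [f]^n).$$

Thus, by similar reasoning and Lemma 4, we have

$$E(m, 1, [f_1] \cdots [f_n]) = \binom{m-1}{n-1}^{-1} \frac{1}{n!}
 \sum_{(\nu_1 \cdots \nu_s)} (-1)^{n-s}\, S(m, [f_{\nu_1 \cdots \nu_s}]^n).$$

The theorem is proved by Lemma 3.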

Corollary 1. Let $\delta$ be a varying unit. Then

i?(m,«, [/a] •■•[/»]) =. D ■ E (-1)— 

(m•••»#) (»;0;p) 



where 


+ • • • + a,. . 

Proof: By the relation $E(m, \delta, [f_1(x)] \cdots [f_n(x)]) =
E(m/\delta, 1, [f_1(\delta x)] \cdots [f_n(\delta x)])$ we obtain the corollary.

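Corollary 2. For any positive real number $m$, we have

$$(11)\qquad E(m, 0, [x^{p_1}] \cdots [x^{p_n}])
 = \frac{(n-1)!\; p_1! \cdots p_n!\; m^{\,p_1 + \cdots + p_n}}{(p_1 + \cdots + p_n + n - 1)!}.$$

Proof: Since
$E(m, \delta, [f_1] \cdots [f_n]) = \sum \frac{(-1)^{n-s}}{n!}
E(m, \delta, [f_{\nu_1 \cdots \nu_s}]^n)$, we have, by letting $\delta \to 0$,

$$E(m, 0, [x^{p_1}] \cdots [x^{p_n}]) = \sum_{(\nu_1 \cdots \nu_s)}
 \frac{(-1)^{n-s}}{n!}\, E(m, 0, [f_{\nu_1 \cdots \nu_s}]^n).$$

The corollary is therefore deduced by (7).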
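The closed form for the expectation of a product of powers when the $x_\nu$ are spread over the simplex $x_1 + \cdots + x_n = m$ can be compared with the discrete expectation for a small varying unit. The sketch below assumes the familiar form $(n-1)!\,p_1!\cdots p_n!\,m^{p_1+\cdots+p_n}/(p_1+\cdots+p_n+n-1)!$ for the right-hand side of (11); the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import factorial, prod

def simplex_moment(m, powers):
    """Assumed closed form for E[x_1^p_1 ... x_n^p_n] on x_1 + ... + x_n = m."""
    n = len(powers)
    s = sum(powers)
    numerator = factorial(n - 1) * prod(factorial(p) for p in powers) * Fraction(m) ** s
    return numerator / factorial(s + n - 1)

def grid_moment(m, powers, steps):
    """Discrete analogue with varying unit m/steps: mean over positive multiples."""
    n = len(powers)
    total, count = Fraction(0), 0
    for js in product(range(1, steps), repeat=n - 1):
        last = steps - sum(js)
        if last < 1:
            continue  # the last part must also be a positive multiple of the unit
        parts = js + (last,)
        total += prod(Fraction(m * j, steps) ** p for j, p in zip(parts, powers))
        count += 1
    return total / count
```

For n = 2, powers (1, 1) and m = 1 the discrete mean with 1000 steps is 1001/6000, close to the limiting value 1/6.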

Theorem 3. (Second generalization of Theorem 1). Let $(x_1 \cdots x_n)$ be a set
of integers under known conditions $x_1 + \cdots + x_n = m$, $a \le x_i \le b$,
where $a, b$ are given integers. Then, for any given polynomial $f(x)$, the
mathematical expectation of the product $f(x_1) \cdots f(x_n)$, denoted by
$E_{(ab)}(m, 1, [f]^n)$, is given by the formula

iab) 

(12) E (m,l, [/]") = ^ 

pwmO 

where 

$g(x) = f(b + x)$, $h(x) = f(a + x - 1)$ and
$m' = m - (a - 1)n + (a - b - 1)\nu$.

Proof: Define $S(m, [f]^n) = 0$ for $m < n$, and $S(m, [f]^0) = 1$ for $m = 0$.
We shall now prove that

t (- 1 )' (”) mh]-) = E f(Xl) ■ ■ • fiXn), 


Sim', [ffl'lfe]"-') 
where on the right-hand side of the expression the set $(x_1, \cdots, x_n)$
under the summation runs over all different compositions of $m$ into $n$ parts
and

$$a \le x_\nu \le b,\qquad \nu = 1, \cdots, n.$$

For convenience we denote the left-hand side of the expression by $\Theta$, that
is,
= £ (-1)' (”) 2 S(,m, [f(x + b)Y) S(m' - m, [/(z + a - 1)]""')- 

\V/ 

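Let $f(\bar{x}_1) \cdots f(\bar{x}_n)$ be a product term contained in $\Theta$,
i.e., $\bar{x}_1 + \cdots + \bar{x}_n = m$;
$\bar{x}_1 \ge a, \cdots, \bar{x}_n \ge a$. We assume that
$\bar{x}_{\nu_1} \ge b + 1, \cdots, \bar{x}_{\nu_t} \ge b + 1$, where
$\nu_i \ne \nu_j$ if $i \ne j$. Then it is seen that the number of occurrences of
the product term in $\Theta$ is given by

$$\sum_{s=0}^{t} (-1)^s \binom{t}{s} =
\begin{cases} 0 & \text{if } t \ge 1, \\ 1 & \text{if } t = 0. \end{cases}$$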


Thus the product term $f(\bar{x}_1) \cdots f(\bar{x}_n)$ of $\Theta$ vanishes
except when $a \le \bar{x}_\nu \le b$, $\nu = 1, \cdots, n$.

Hence we have

$$\Theta = \sum_{a \le x \le b} f(x_1) \cdots f(x_n).$$

Next, we shall find the number of different compositions of $m$ into $n$ parts
with each $a \le x_\nu \le b$, i.e., the number of product terms of $\Theta$. By
the above result we see that the number is given by


SE (-1)' 


pwmO m^rnP 



i: 1 E 1 



Hence the theorem. ^ 

This theorem shows that the mathematical expectation $E_{(ab)}(m, 1, [f]^n)$ can
be expressed by $S(m, [g]^n)$ and is therefore expressible in terms of linear
combinations of the coefficients of the polynomial $f(x)$.

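Corollary 1. Let $\delta$ be a varying unit for which
$\dfrac{m}{\delta}, \dfrac{a}{\delta}, \dfrac{b}{\delta}$ are all integers. Then

$$E_{(ab)}(m, \delta, [f(x)]^n)
 = E_{((a/\delta),(b/\delta))}\!\left(\frac{m}{\delta}, 1, [f(\delta x)]^n\right).$$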

Corollary 2. Let /i(r), • • • fn{x) he n given polynomials. Then 


E (m, l,[/i]...[/n]) 

(Ob) 


(M —•'.) 




E (m, 1, 

(o,b) 
Corollary 3. The number of integral solutions of the equation
$x_1 + \cdots + x_n = m$ with $a_1 \le x_1 \le b_1, \cdots, a_n \le x_n \le b_n$
is equal to

$$\sum_{\nu_1=0,\cdots,\nu_n=0}^{1,\cdots,1} (-1)^{\nu_1 + \cdots + \nu_n}
 \binom{m + n - (a_1 + \cdots + a_n) + (a_1 - b_1 - 1)\nu_1 + \cdots + (a_n - b_n - 1)\nu_n - 1}{n - 1}.$$

Proof: We have shown that the number of integral solutions of the equation
$x_1 + \cdots + x_n = m$ with $a \le x_\nu \le b$ is given by

$$\sum_{\nu=0}^{n} (-1)^{\nu} \binom{n}{\nu}
 \binom{m - (a - 1)n + (a - b - 1)\nu - 1}{n - 1}.$$

Hence the number of integral solutions of the equation
$x_{11} + \cdots + x_{1n_1} + \cdots + x_{s1} + \cdots + x_{sn_s} = m$ with
$a_\nu \le x_{\nu\mu} \le b_\nu$ ($\nu = 1, \cdots, s$; $\mu = 1, \cdots, n_\nu$)
is given by

$$\sum_{\nu_1=0,\cdots,\nu_s=0}^{n_1,\cdots,n_s} (-1)^{\nu_1 + \cdots + \nu_s}
 \binom{n_1}{\nu_1} \cdots \binom{n_s}{\nu_s}
 \binom{m - (a_1 - 1)n_1 - \cdots - (a_s - 1)n_s + (a_1 - b_1 - 1)\nu_1 + \cdots + (a_s - b_s - 1)\nu_s - 1}{n_1 + \cdots + n_s - 1}.$$

The corollary follows at once by putting $n_1 = \cdots = n_s = 1$, $s = n$.

This corollary can be restated in a more interesting manner as follows:

Let there be $n$ storerooms, and let $b_1, \cdots, b_n$ be the numbers of stocks
contained in the 1st, 2nd, $\cdots$, $n$-th storerooms respectively. Then $m$
stocks containing at least $a_i$ stocks of the $i$-th storeroom
($i = 1, \cdots, n$) can be chosen from these $n$ storerooms in

$$\sum_{\nu_1=0,\cdots,\nu_n=0}^{1,\cdots,1} (-1)^{\nu_1 + \cdots + \nu_n}
 \binom{m + n + (a_1 - b_1 - 1)\nu_1 + \cdots + (a_n - b_n - 1)\nu_n - a_1 - \cdots - a_n - 1}{n - 1}$$

different ways.
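The counting formula of Corollary 3 can be compared with direct enumeration. The sketch below reads the bounds as $a_i \le x_i \le b_i$ and treats a binomial coefficient with negative upper index as zero; both points, and the function names, are assumptions of the illustration.

```python
from itertools import product
from math import comb

def binom(top, bottom):
    """Binomial coefficient, taken as 0 when the upper index is negative."""
    return comb(top, bottom) if top >= 0 else 0

def count_by_formula(m, lower, upper):
    """Alternating sum over nu_i in {0, 1}, as in Corollary 3."""
    n = len(lower)
    total = 0
    for nu in product((0, 1), repeat=n):
        shift = sum((a - b - 1) * v for a, b, v in zip(lower, upper, nu))
        top = m + n - sum(lower) + shift - 1
        total += (-1) ** sum(nu) * binom(top, n - 1)
    return total

def count_by_enumeration(m, lower, upper):
    """Count integer solutions of x_1 + ... + x_n = m with a_i <= x_i <= b_i."""
    ranges = [range(a, b + 1) for a, b in zip(lower, upper)]
    return sum(1 for xs in product(*ranges) if sum(xs) == m)
```

A familiar instance is the number of ways three dice can show a total of 10: `count_by_formula(10, (1, 1, 1), (6, 6, 6))`.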

So far we have established several combinatorial formulas concerning the 
mathematical expectation of the product fi(xi) • • • fn{xn) under certain con¬ 
ditions. In the next section, we shall explain how to apply these formulas. 


4. Applications. (a) A criterion. In order to make the above formulas
applicable to practical problems we state a criterion as follows: The
mathematical expectation of a function $F(x_1, \cdots, x_n)$ can be estimated by
the combinatorial formulas if and only if the sum of these undetermined
quantities $x_1, \cdots, x_n$ is known and there exist $n$ polynomials
$f_1(x), \cdots, f_n(x)$ such that $F \propto f_1, \cdots, F \propto f_n$, where
the quantities $x_1, \cdots, x_n$ may or may not be continuous. When the
quantities are discontinuous, the varying unit is certainly given.

(b) Some approximations. For/(x) = i5o + * • • + ^ 0) we may write 


where $S_{r,\nu}$ is a Stirling number of the second kind, as used by Jordan,
and defined by

Xa-O \^/ 

Thus, the formulas (5) and (9) can be written as follows:


(S') 


(9') 


Bijn, 1, [/]") 


V (»» + ra — 1)1 (wt — n)l n! (n — 1) 1 
(.■fe) (m — (r)!((r + n — l)!(Ttt — 1)! 

. ^ (<?> 5>,r + • • • 4- fa S>.>)*’ ‘’ 

Pi* 1 


1, [/il ••■[/,)) = E E (-1)"-* 

(•'I**!'*) (n;0;p) 

, (m + n - l)!(m - n)!n!(n - 1)! A (B, 5r.r + * - + 

(m — <r)!(<r + n — l)!(w — 1)1 Pw\ * 


where 


Sr# = I'l^r# , /< = + * • ' + /3<jbX^ -Bi = Pu + - + Pni • 

Now we state some convenient formulas concerning the number S#,. 

If m is sufficiently large and t is smaller than the following recurrence rela¬ 
tion is useful: 


(13) + ■■■ 

- (7+*l) (<*4^2*) 

+ ••• + [(2f — l)Xi-2 + 

where X, sa 1, s 0 and Xi, * • • , Xf-a are all independent of m. 
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Stiurting from the first equality and using the recurrence rdation Sm.n^i ^ 
mSm,n + 8m^i,n succcssively we have 

m 

- §^{§ {::!+:>U7 
" §^{C+/+2)^'+^'++C+/+1)^' 

- g [« + + (1 + j)Xyl(j 1 ) ’ 

where X-i = Xt-i = 0. The recurrence relation is thus deduced. 

Writing 

= ( 7 + 1 ) + ’^‘(7 + 2 ) + • • • + 2t 0 ’ 

and using the recurrence relation as obtained above, the coefficients Xi, • • • , X/«i 
may be exhibited as follows: 


 t   λ1      λ2       λ3        λ4         λ5         λ6         λ7         λ8
 1
 2   3
 3   10      15
 4   25      105      105
 5   56      490      1260      945
 6   119     1918     9450      17325      10395
 7   246     6825     56980     190575     270270     135135
 8   501     22935    302995    1636635    4099095    4729725    2027025
 9   1012    74316    1487200   12122110   47507460   94594500   91891800   34459425
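The tabulated coefficients can be regenerated and checked against Stirling numbers of the second kind. The sketch below assumes the expansion $S_{n+t,n} = \binom{n+t}{t+1} + \lambda_1\binom{n+t}{t+2} + \cdots + \lambda_{t-1}\binom{n+t}{2t}$ and the recurrence $\lambda_j(t) = (j+1)\lambda_j(t-1) + (t+j)\lambda_{j-1}(t-1)$ with $\lambda_0 = 1$; this is the standard recurrence for these numbers, taken here to coincide with the paper's (13). Function names are hypothetical.

```python
from math import comb

def lambda_table(t_max):
    """rows[t][j] = lambda_j(t) for j = 0..t-1, with lambda_0 = 1."""
    rows = {1: [1]}
    for t in range(2, t_max + 1):
        prev = rows[t - 1]
        row = []
        for j in range(t):
            same = prev[j] if j < len(prev) else 0   # lambda_j(t-1)
            left = prev[j - 1] if j >= 1 else 0      # lambda_{j-1}(t-1)
            row.append((j + 1) * same + (t + j) * left)
        rows[t] = row
    return rows

def stirling2(m, n):
    """Stirling number of the second kind via S(i, j) = j S(i-1, j) + S(i-1, j-1)."""
    table = [[0] * (n + 1) for _ in range(m + 1)]
    table[0][0] = 1
    for i in range(1, m + 1):
        for j in range(1, min(i, n) + 1):
            table[i][j] = j * table[i - 1][j] + table[i - 1][j - 1]
    return table[m][n]

def stirling2_from_lambdas(n, t, rows):
    """Evaluate S(n+t, n) from the binomial expansion with coefficients lambda_j."""
    return sum(lam * comb(n + t, t + 1 + j) for j, lam in enumerate(rows[t]))
```

The last entry of each row is $(2t)!/(t!\,2^t)$, for instance 34459425 for t = 9.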


Now let 

s„,n+,=[(” ^ 7) + (r+ 2) + •+(” i *)] ”• • 

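The recurrence relation obtained above gives

$$\lambda_{t-1}(t) = (2t - 1)\,\lambda_{t-2}(t - 1),$$

$$\lambda_{t-2}(t) = 2(t - 1)\,\lambda_{t-3}(t - 1) + (t - 1)\,\lambda_{t-2}(t - 1),$$

$$\lambda_{t-1}(t) = \frac{(2t)!}{t!\,2^{t}},$$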

X,-s«) = (« - 1)! E 2'-""^ V 

r-l 



Thus we obtain 
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Since the orders ^ > • * * »^ are all less than 2t aa n 


and since 


(”+‘)x„«)-i(0n(i+^) 

-hm-m-s) 

(” t;) x„«) - {"i ') »-»' 


We may write (by Stirling's formula) 




< + 4 ‘' 0 ‘) e(t) + u^ 


where Cn —> 0 as w —> oo. 

Now it is easily proved that the inequality

holds for every positive integer $x$. We have, therefore,
and


Using these inequalities we have

+ 2)» <4‘Vo < l\/tzri “ «<> 

where it may be noted that 


lim ~ = 1. 


Hence we have in conclusion 


(14) g (-1)" (:)- (i)' (0 ‘+- ). 


where 


3 ^ ^ Vt d 


vS - »■ 

Evidently the formula (14) implies (15) and (16):

ic-ir'Q*" 




(15) 


(16) 




nl 


t = 0(n^-), 


> 0 . 


(Stirling's formula). 
ON THE CONSTITUENT ITEMS OF THE REDUCTION AND THE
REMAINDER IN THE METHOD OF LEAST SQUARES

By S. Vajda
London

1. Consider a set of variates $y_i$, ($i = 1, 2, \cdots, n$), which are normally
and independently distributed with variance 1. Let also a matrix $(x_{ik})$ with
$i = 1, 2, \cdots, n$; $k = 1, 2, \cdots, s$ and rank $s$ be given. Find
$b_1, \cdots, b_s$ in terms of $y_i$ so that

$$\psi^2 = \sum_i \Big(y_i - \sum_k x_{ik} b_k\Big)^2$$

is a minimum. This minimum value shall be denoted by $\psi^2_{\min}$.

It is known (see e.g. R. A. Fisher, "Applications of Student's distribution",
Metron Vol. 5, Part 3 (1925)) that $\psi^2_{\min}$ varies as does $\chi^2$ with
$n - s$ degrees of freedom and that it is possible to express $\psi^2_{\min}$ as
the sum of $n - s$ squares of linear functions of the $y_i$. In the following
lines $\sum_i y_i^2$ will be expressed as the sum of $n$ squares of such
functions which are independent and of variance 1. The sum of the first $s$
squares will equal $\sum_i y_i^2 - \psi^2_{\min}$ and therefore the remaining
$n - s$ squares equal $\psi^2_{\min}$.

Thus a simple way will be found of writing down explicitly the linear functions,
whose existence only was proved by Professor Fisher in Metron.

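2. We first calculate $\psi^2_{\min}$.

$\dfrac{\partial \psi^2}{\partial b_l} = 0$, for $l = 1, 2, \cdots, s$, gives the
normal equations

$$(1)\qquad \sum_{i=1}^{n} x_{il} y_i = \sum_{i=1}^{n} \sum_{k=1}^{s} x_{il} x_{ik} b_k,$$

which can be written

$$(2)\qquad \sum_{i=1}^{n} x_{il} y_i = \sum_{k=1}^{s} X_{lk} b_k$$

with

$$X_{lk} = \sum_{i=1}^{n} x_{il} x_{ik}.$$

It follows from (1) that

$$(A)\qquad \psi^2_{\min} = \sum_{i=1}^{n} y_i^2
 - \sum_{i=1}^{n} \sum_{l=1}^{s} \sum_{k=1}^{s} x_{il} x_{ik} b_l b_k
 = \sum_{i=1}^{n} y_i^2 - \sum_{l=1}^{s} \sum_{k=1}^{s} X_{lk} b_l b_k,$$

where the $b$ are solutions of (1).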
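3. A second expression for $\psi^2_{\min}$ can be found as follows:

Introducing

$$c_i = \sum_{k=1}^{s} x_{ik} b_k,$$

we obtain from (1)

$$(3)\qquad \sum_{i=1}^{n} x_{il} c_i = \sum_{i=1}^{n} x_{il} y_i,\qquad (l = 1, 2, \cdots, s).$$

Now if $z_{iu}$, ($u = s + 1, \cdots, n$), are any $n - s$ independent solutions
of

$$\sum_{i=1}^{n} x_{il} z_{iu} = 0,\qquad (l = 1, 2, \cdots, s),$$

then the $c_i$ satisfy also

$$(4)\qquad \sum_{i=1}^{n} z_{iu} c_i = 0,\qquad (u = s + 1, \cdots, n).$$

Let such a set of $z_{iu}$ be chosen. Then (3) will be solved by

$$(5)\qquad c_i = y_i - \sum_{v=s+1}^{n} \lambda_v z_{iv}$$

with $\lambda_v$ as indefinite factors, and these $c_i$ satisfy (4) if

$$(6)\qquad \sum_{i=1}^{n} z_{iu} y_i = \sum_{v=s+1}^{n} \sum_{i=1}^{n} z_{iu} z_{iv} \lambda_v,
\quad (u = s + 1, \cdots, n),\qquad \text{or}\qquad
\sum_{i=1}^{n} z_{iu} y_i = \sum_{v=s+1}^{n} Z_{uv} \lambda_v$$

with

$$Z_{uv} = \sum_{i=1}^{n} z_{iu} z_{iv}.$$

Because of (2) the equation (A) can be transformed into

$$\psi^2_{\min} = \sum_{i=1}^{n} y_i^2 - \sum_{i=1}^{n} \sum_{l=1}^{s} x_{il} y_i b_l
 = \sum_{i=1}^{n} y_i^2 - \sum_{i=1}^{n} y_i c_i
 = \sum_{i=1}^{n} \sum_{v=s+1}^{n} y_i z_{iv} \lambda_v,$$

which is, because of (6),

$$(B)\qquad \psi^2_{\min} = \sum_{u=s+1}^{n} \sum_{v=s+1}^{n} Z_{uv} \lambda_u \lambda_v,$$

where the $\lambda$ are solutions of (6).

The comparison of (A) and (B) gives

$$\sum_{i=1}^{n} y_i^2 = \sum_{l=1}^{s} \sum_{k=1}^{s} X_{lk} b_l b_k
 + \sum_{u=s+1}^{n} \sum_{v=s+1}^{n} Z_{uv} \lambda_u \lambda_v,$$

where the first form on the r.h.s. shows the reduction of $\sum y_i^2$ by the
method of least squares and the second form constitutes the remainder.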

4. These two forms must now be expressed in terms of the $y_i$.

We introduce the notations

X11X12 Xn -Xu 

XnXn X.i ••• X„ 


y 5r(#+2) __ 


^a+l«+2 
^«+2a+l Z,^2 *4-2 


It is well known (and can easily be verified) that 

t, t, Xikhb, = ^T) (Xn6i + • • • + X,.b.)‘ 

I-l jb-1 -A 


+ X«> 




^2 + • • • + 


Y 

r2.x,.rv 


"T • • • "b j^(«) ^ 


which may be wTitten 


f(i) Xikbi^ 


1 


Xu ZXub* 

A;-.l 

Xu i:xstb, 


xuXu • • • i: Xi*bt 


X'rri) 


X..X.J • • • E x.*bt 

ik»i 


Using (2), this can be expressed in terms of the yi instead of hk as follows: 


(gxul/*) +J^ 


n |2 

Xu 

t-1 

n 

X 21 '^Xi2yi 


1 ^ 

XuXi2 • • * Xiiyi 

t-i 




X 41 Xai • • 53 2 
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Similarly (6) the second form can be transformed into 


(8) ^2.1. - 


1 




+ ■ 


r(»—1) rwin) 


n |2 



^ftt+1 ^n8+2 


Z 2<n Vi 


The rank of $(x_{ik})$ is $s$, so that the order of the suffixes can always be
chosen so as to make the above denominators different from zero.

Thus both the reduction and the remainder have been expressed by sums of
squares, whose numbers correspond to the "degrees of freedom" $s$ and $n - s$
respectively.


5. It remains to be shown that the linear functions of the $y_i$ appearing in
each form are mutually orthogonal and that in every one of them the sum of the
squares of the coefficients is unity.

Now if we call the $n$ linear forms which occur above
$\sum_{j=1}^{n} a_{ij} y_j$ ($i = 1, 2, \cdots, n$), then our proof implies that

$$\sum_{i=1}^{n} y_i^2 = \sum_{i=1}^{n} \Big[\sum_{j=1}^{n} a_{ij} y_j\Big]^2
 = \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} a_{ij} a_{ik} y_j y_k.$$

This is an identity for any $y_i$, hence we must have

$$\sum_{i=1}^{n} a_{ij} a_{ik} =
\begin{cases} 1 & \text{if } j = k, \\ 0 & \text{if } j \ne k. \end{cases}$$

We have thus shown that the matrix $(a_{ij})$ is orthogonal and it follows that

$$\sum_{i=1}^{n} a_{ji} a_{ki} =
\begin{cases} 1 & \text{if } j = k, \\ 0 & \text{if } j \ne k. \end{cases}$$

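6. In practical applications the $x_{ik}$ will be given and if the expression
(7) or (8) is to be written down we must first solve the set of equations

$$\sum_{i=1}^{n} z_{iu} x_{il} = 0,\qquad (l = 1, 2, \cdots, s).$$

We may assume that

$$\begin{vmatrix} x_{11} & \cdots & x_{s1} \\ \vdots & & \vdots \\ x_{1s} & \cdots & x_{ss} \end{vmatrix} \ne 0.$$

There exist, of course, an infinity of solutions. A very simple one can be found
if the matrix $(x_{ik})$ is completed into a square matrix by adding 1 in the
diagonal places and 0 elsewhere. We obtain

$$\begin{matrix}
x_{11} & \cdots & x_{s1} & x_{s+1,1} & \cdots & x_{n1} \\
\vdots & & \vdots & \vdots & & \vdots \\
x_{1s} & \cdots & x_{ss} & x_{s+1,s} & \cdots & x_{ns} \\
0 & \cdots & 0 & 1 & \cdots & 0 \\
\vdots & & \vdots & \vdots & & \vdots \\
0 & \cdots & 0 & 0 & \cdots & 1
\end{matrix}$$

The minors of the terms of any of the $(s + 1)$th, $\cdots$, $n$th line give one
of $n - s$ independent sets of solutions for the $z_{iu}$.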

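If, e.g. $s = 1$, then the $z_{iu}$ are

$$\begin{matrix}
-x_{21} & x_{11} & 0 & 0 & \cdots \\
-x_{31} & 0 & x_{11} & 0 & \cdots \\
-x_{41} & 0 & 0 & x_{11} & \cdots
\end{matrix}\qquad \text{etc.}$$

and the $Z$ are

$$\begin{matrix}
x_{11}^2 + x_{21}^2, & x_{21}x_{31}, & x_{21}x_{41} \\
x_{21}x_{31}, & x_{11}^2 + x_{31}^2, & x_{31}x_{41} \\
x_{21}x_{41}, & x_{31}x_{41}, & x_{11}^2 + x_{41}^2
\end{matrix}\qquad \text{etc.}$$

Hence, for $s = 1$, $n = 2$,

$$\psi^2_{\min} = \sum_{i=1}^{2} y_i^2 - \frac{(x_{11}y_1 + x_{21}y_2)^2}{x_{11}^2 + x_{21}^2}
 = \frac{(-x_{21}y_1 + x_{11}y_2)^2}{x_{11}^2 + x_{21}^2},$$

and for $s = 1$, $n = 3$

$$\psi^2_{\min} = \sum_{i=1}^{3} y_i^2
 - \frac{(x_{11}y_1 + x_{21}y_2 + x_{31}y_3)^2}{x_{11}^2 + x_{21}^2 + x_{31}^2}
 = \frac{(-x_{21}y_1 + x_{11}y_2)^2}{x_{11}^2 + x_{21}^2}
 + \frac{1}{(x_{11}^2 + x_{21}^2)
 \begin{vmatrix} x_{11}^2 + x_{21}^2 & x_{21}x_{31} \\ x_{21}x_{31} & x_{11}^2 + x_{31}^2 \end{vmatrix}}
 \begin{vmatrix} x_{11}^2 + x_{21}^2 & -x_{21}y_1 + x_{11}y_2 \\ x_{21}x_{31} & -x_{31}y_1 + x_{11}y_3 \end{vmatrix}^2 .$$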



1 1 1 1 •••• 2 1 
1 1 1 1 •••• 1 2 


and 



The sum of squares into which $\psi^2_{\min}$ can be transformed is then found
to be¹

$$\tfrac{1}{2}(-y_1 + y_2)^2 + \tfrac{1}{6}(-y_1 - y_2 + 2y_3)^2
 + \tfrac{1}{12}(-y_1 - y_2 - y_3 + 3y_4)^2 + \cdots.$$

¹ This is the result contained in a paper by J. O. Irwin, "Independence of the
constituent items in the analysis of variance," Suppl. Roy. Stat. Soc. Jour.
Vol. 1 (1934).
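The decompositions for $s = 1$ can be verified with exact arithmetic. The sketch below is an illustration only: it takes the remainder to be $\sum y_i^2 - (\sum x_{i1}y_i)^2/\sum x_{i1}^2$, uses the $n = 3$ sum of two squares as read above, and treats the last display as the case in which every $x_{i1} = 1$; integer inputs and $x_{11} \ne 0$ are assumed, and the function names are hypothetical.

```python
from fractions import Fraction

def remainder_direct(x, y):
    """Sum of y_i^2 minus the reduction (sum x_i y_i)^2 / sum x_i^2, for s = 1."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return Fraction(sum(b * b for b in y)) - Fraction(sxy * sxy, sxx)

def remainder_n3(x, y):
    """Remainder for s = 1, n = 3 written as a sum of two squares (x[0] != 0)."""
    x1, x2, x3 = x
    y1, y2, y3 = y
    w2 = -x2 * y1 + x1 * y2            # sum_i z_i2 y_i
    w3 = -x3 * y1 + x1 * y3            # sum_i z_i3 y_i
    z22, z23, z33 = x1 * x1 + x2 * x2, x2 * x3, x1 * x1 + x3 * x3
    det = z22 * z33 - z23 * z23
    return Fraction(w2 * w2, z22) + Fraction((z22 * w3 - z23 * w2) ** 2, z22 * det)

def helmert_sum(y):
    """(-y1+y2)^2/2 + (-y1-y2+2y3)^2/6 + ... for the case of all x_i1 = 1."""
    total = Fraction(0)
    for j in range(2, len(y) + 1):
        contrast = (j - 1) * y[j - 1] - sum(y[: j - 1])
        total += Fraction(contrast * contrast, j * (j - 1))
    return total
```

With x = (1, 2, 3) and y = (2, 0, 1) both forms of the remainder give 45/14.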



NOTES

This section is devoted to brief research and expository articles, notes on
methodology and other short items.


ON THE ANALYSIS OF A CERTAIN SIX-BY-SIX FOUR-GROUP
LATTICE DESIGN USING THE RECOVERY OF
INTER-BLOCK INFORMATION

By Boyd Harshbarger¹

Virginia Agricultural Experiment Station

1. Introduction. A detailed description for a six-by-six four-group lattice
design is given in a recent article [1] by the author, and the analysis is
developed which uses only the intra-block information to correct the varieties
for the block effects. Here is developed the analysis that makes use of both the
intra- and the inter-block information.

Referring to Group X on page 307, [1], since block (1) contains varieties 1 to 6,
and block (2) contains varieties 7 to 12, the difference between the means of
these two blocks is also an estimate of the difference between the first six
varieties and the second six varieties. The information obtained from such
inter-block comparisons was ignored in the previous analysis. In attempting to
use this information, the chief difficulty is to decide how estimates derived
from the comparison of block totals shall be combined with the previous
estimates. Since each block consists of six plots, comparisons between block
totals may be expected to have a higher error variance than the within-block
comparisons, just as in split-plot designs the main block comparisons usually
have a higher error than the sub-plot comparisons. The problem is, therefore, to
estimate the relative error variances of the inter- and intra-block comparisons,
and then to combine the two types of estimates to the best advantage.

2. Calculations of the adjusted varietal totals. In addition to the equations
(7), [1], which contain all the intra-block information, we now have the
additional set of equations,

$B_i = 6\mu$ + (sum varietal constants in this block) + $\epsilon_i$, which are
estimated by

$$B_i = 6m + \sum v_{b_i} + E_i.$$

In these equations and all the following equations, the double prime symbol
(") used in [1] is omitted, but the statistics have the same meaning as in
equations (7), [1], except that in this paper they are adjusted by both inter-
and intra-block information.

1 The author wishes to express his appreciation to W. G. Cochran of Iowa State College, 
who advised in the preparation of this analysis. 
The general problem is to minimize the function,


F = WS(.yij - m - vj - hiY + ^S(Bi - &m - 

" D 

S6 tt h j 

subject to the restriction X) = 0 and = 0, and where W = -rand 

j*»l e^x t»l (T 

W = \. 

<Th 

Following the method given in (1) the typical block equations for • • • 6x« is 
“ 6 3W + W ~ “■ 6 3FT1P' 


and for h^i • • • 6«6 is 


= i4ii(r+-W0(3l^-Tr) + 22 TFTF' + 

+ (W- + C^)] + (Cn, + C„4 + C^)|. 

It can be seen that for $W' = 0$, $b_{xi}$ and $b_{ui}$ are the intra-block
values given in [1] and for $W' = W$ they are the randomized block values.

A typical adjusted varietal total then becomes

4:Vi + 4m =s Fi — (bzi + hyi + ^*i + ^u 2 ). 

3. Estimation of $W$ and $W'$. Following the method presented by Cochran [6]
and Yates [3], the error of a block total may be written as

$$E_i = e_{i1} + e_{i2} + \cdots + e_{i6} + 6b_i,$$

where

$$V(e) = \sigma^2 \quad\text{and}\quad V(b_i) = \sigma_b^2.$$

Hence $V(E_i) = 6\sigma^2 + 36\sigma_b^2$ and component (a) is thus an estimate
of $\sigma^2 + 6\sigma_b^2$.
One finds from evaluating the expected value of (15), [I] corrected for replicates, 




that the expected value of component (h) is + }*G(r6. 


In the analysis of variance if components (a) and (b) are pooled, one obtains the 
block variance 5 as an estimate of + {• Ga?. Since the intra-block variance 
is an estimate of the estimates of the true variance between* blocks, + Gcr? , 
. SB-E 1 

IS 7 • 

4. Standard error of adjusted varietal means. The standard error of the
difference between the adjusted means of two varieties which appear together in
the same blocks in groups Z or U, is
1 r.,. , sw 1 


kW 


3W + W'. 





obtained by the method outlined by Cochran. Similarly, for the case in which
the varieties are together in the same block in groups Z or U,

When an attempt is made to express the difference between these two adjusted
varieties which appear together in the same block in groups X or Y in terms of
the levels of the main effects and interactions, the interactions are no longer
unconfounded and the method employed above breaks down.

If one is willing to assume that the formula for the variance of the difference 
between two adjusted varietal means for varieties which appear together in the 

1 / BW \ 

same block in the groups X or Y is of the form 24 ^^!^ + SW + W j 

constants may be determined by the values already known, [ 1 ]. This form can 
be shown to be that for a quadruple lattice. 

1 / BW \ 

The formula (^4 -f ^ reduce to the value for intra-block 

analysis [1] when W' = 0, and when W = IT' to the value for complete random¬ 
ized blocks. When these conditions are imposed, the formula becomes 

1 , SOW \ 

144F \ ■^3IT + W* 

This value is slightly larger than the value obtained when the adjusted varieties 
appear together in the same block in groups Z or 17, as should be the case. This 
gives us a lower limit. One can arrive at the upper limit in the following manner: 
suppose the variance (intra)i obtained in the intra-block analysis for the difference 
between two varietal means such as tn and th is greater than that for varietal 
means and (intra) 2 , then it follows that: 


(inter -f- intra)i ^ (inter + intra )2 X 


(intra)i 


... --- ^ (intra),' 

Using this relation, the upper limit for two varieties together in the same block 
in groups X or Y is 


Am 


24IF V ' 3IT + Wy 63 ’ 

which gives a value slightly greater than the formula derived, as it should if it 
is to be the upper limit. In a similai- manner one gets the variance for the differ¬ 
ence between varietal means not appearing together in the same block. 

5. Efficiency of the design relative to the randomized complete blocks. By the
method outlined by Cochran [6] the efficiency can be shown to be measured by
the ratio of
k . 1 


w^w 


to 4 (average error variance of the difference between two plots). 


I ^ v V VA UAAV' VAAAA V'A V'AAV'V/ V TT j^A« 

It will be noted, by using the above formula, that the gain in efficiency for 
the numerical problem given in [1] is 1.003, which for our purpose here is zero. 
This, in general, will not be the case, for on most soils there is a block
difference. In this particular test the ground used had been previously filled
in with well mixed soil. The efficiency for the analysis given in [1] relative to
the randomized complete blocks was less than 1.00.

This paper and the previous one show what a long tedious procedure is necessary
to analyze the data, when the design does not follow the rules for the
construction of the lattice, triple lattice, etc. The complexity of these
methods stresses the importance, to those designing experiments, of not
deviating from the established design if the most information is to be secured
from the data with simple calculations.
FURTHER REMARKS ON LINKAGE THEORY IN
MENDELIAN HEREDITY

By Hilda Geiringer
Wheaton College

In the following an explicit formula for the distribution of genotypes in case of
three Mendelian characters will be given [formula (5)]. The complete discussion
of the case $m = 3$ suggests a supplement (as stated in the last paragraph of
this paper) to the general limit theorem dealing with $m$ characters.

In an earlier paper¹ recurrence formulae have been derived which furnish the
distribution of genotypes in the $n$th generation if the distribution in the
$(n - 1)$th generation and the "linkage distribution" (l.d.) are known. It was
also shown how to "integrate" this system of difference equations so as to
determine the distribution in the $n$th generation directly from that in the 0th
generation. This last method, though straightforward, requires however in each
particular case quite a few operations.

¹ Hilda Geiringer, Annals of Math. Stat., Vol. 15 (1944), pp. 25-57. The notation
in the present Note will be the same as in this paper.

In case m, the number of Mendelian characters, equals two, an explicit
formula for the problem in question had been known. Denote by $p(x_1, x_2)$
$(x_1, x_2 = 1, 2, \ldots, k)$ the "distribution of transmitted genes" in the original, 0th,
generation, by $p^{(n)}(x_1, x_2)$ that in the nth generation and by c the "crossover
probability" (c.p.). Then the simple formula holds:²

(1) $p^{(n)}(x_1, x_2) = (1 - c)^n p(x_1, x_2) + [1 - (1 - c)^n]\, p_1(x_1) p_2(x_2)$.

This may also be written:

(1') $p^{(n)}(x_1, x_2) = p_1(x_1) p_2(x_2) + (1 - c)^n [p(x_1, x_2) - p_1(x_1) p_2(x_2)]$,

where $p_1(x_1)$, $p_2(x_2)$ are the marginal distributions derived from $p(x_1, x_2)$. (1') shows
that, in the case of independence of the original distribution, $p(x_1, x_2) = p_1(x_1) p_2(x_2)$,
we have $p^{(n)}(x_1, x_2) = p(x_1, x_2)$ for every n. The same is true for arbitrary $p(x_1, x_2)$
if c = 0. Otherwise, if c > 0 the second term to the right in (1') tends towards
zero as $n \to \infty$ and the well known limit theorem results.
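As an illustration only, formula (1') can be evaluated numerically as in the following sketch. The function name `two_locus` and the nested-list representation of $p(x_1, x_2)$ are assumptions made for this example and do not come from the paper. Since (1) with n = 1 is a single-generation step, applying it repeatedly should reproduce the closed form.

```python
def two_locus(p, c, n):
    """Evaluate formula (1') for the nth generation.

    p : nested list, p[i][j] = frequency of the gene pair (x1 = i, x2 = j)
        in the original generation (hypothetical representation).
    c : crossover probability, 0 <= c <= 1.
    n : number of generations.
    """
    rows, cols = len(p), len(p[0])
    # Marginal distributions p1(x1) and p2(x2) of the original generation.
    p1 = [sum(p[i][j] for j in range(cols)) for i in range(rows)]
    p2 = [sum(p[i][j] for i in range(rows)) for j in range(cols)]
    decay = (1.0 - c) ** n
    return [[p1[i] * p2[j] + decay * (p[i][j] - p1[i] * p2[j])
             for j in range(cols)] for i in range(rows)]


if __name__ == "__main__":
    start = [[0.4, 0.1], [0.1, 0.4]]
    print(two_locus(start, 0.2, 5))
```

With c > 0 the deviation from the product of the marginals shrinks by the factor (1 − c) in every generation, which is the limit theorem mentioned in the text.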

In case m = 3, a remarkably elegant explicit formula exists³ which may be
deduced from the author's general theory. In this case the l.d. is completely
equivalent to the three c.p.'s $c_{12}$, $c_{23}$, $c_{13}$. The $c_{ij}$ are probabilities with sum
$\le 2$, and for which the triangular relation

(2) $c_{ij} + c_{jk} \ge c_{ik}$

holds. If $l(\epsilon_1, \epsilon_2, \epsilon_3)$ $(\epsilon_i = 0, 1)$ denotes the eight values of the l.d. we have (see
quot. [1], p. 32) $l(000) = l(111)$, $l(100) = l(011)$, $l(010) = l(101)$, $l(001) = l(110)$,
hence three independent values only. We may introduce

(3) $2l(000) = v(000) = v_0$, $2l(100) = v(100) = v_1$, $2l(010) = v(010) = v_2$,
$2l(001) = v(001) = v_3$; $v_0 + v_1 + v_2 + v_3 = 1$.

It follows easily that

(4) $c_{ij} = v_i + v_j$, $(i \ne j;\ i, j = 1, 2, 3)$.

The original distribution $p(x_1, x_2, x_3)$ has marginal distributions $p_{ij}(x_i, x_j)$,
$p_i(x_i)$. These values will be denoted briefly by $p_{123}$, $p_{12}$, $p_{23}$, $p_{13}$, $p_1$, $p_2$, $p_3$
respectively. Writing in an analogous way $p^{(n)}(x_1, x_2, x_3) = p^{(n)}_{123}$, the new formula is
the following:

(5) $p^{(n)}_{123} = p_1 p_2 p_3 + [(v_0 + v_1)^n - v_0^n](p_1 p_{23} - p_1 p_2 p_3) + [(v_0 + v_2)^n - v_0^n](p_2 p_{13} - p_1 p_2 p_3)$
$\qquad + [(v_0 + v_3)^n - v_0^n](p_3 p_{12} - p_1 p_2 p_3) + v_0^n (p_{123} - p_1 p_2 p_3)$.

This useful formula permits $p^{(n)}_{123}$ to be computed readily for every n. In terms of the
$c_{ij}$, writing

(6) $d_{ij} = 1 - c_{ij}$, $v_0 = 1 - \tfrac{1}{2}(c_{12} + c_{23} + c_{13})$,

it reads

(5') $p^{(n)}_{123} = p_1 p_2 p_3 + (d_{23}^n - v_0^n)(p_1 p_{23} - p_1 p_2 p_3) + \cdots + v_0^n (p_{123} - p_1 p_2 p_3)$.
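A minimal sketch of how formula (5) might be evaluated is given below. The function name `three_locus`, the dictionary keyed by gene triples, and the default of k = 2 alleles per character are assumptions chosen for this illustration, not part of the paper. For n = 1 the formula reduces to $v_0 p_{123} + v_1 p_1 p_{23} + v_2 p_2 p_{13} + v_3 p_3 p_{12}$, so repeated single steps give an independent check of the closed form.

```python
from itertools import product


def three_locus(p, v, n, k=2):
    """Evaluate formula (5) for the nth generation.

    p : dict mapping (x1, x2, x3) -> frequency in the original generation
        (hypothetical representation; each x runs over 0..k-1).
    v : tuple (v0, v1, v2, v3) with v0 + v1 + v2 + v3 = 1, as in (3).
    """
    v0, v1, v2, v3 = v
    R = range(k)
    # Marginal distributions of order one and two of the original generation.
    p1 = {a: sum(p[a, b, c] for b in R for c in R) for a in R}
    p2 = {b: sum(p[a, b, c] for a in R for c in R) for b in R}
    p3 = {c: sum(p[a, b, c] for a in R for b in R) for c in R}
    p23 = {(b, c): sum(p[a, b, c] for a in R) for b in R for c in R}
    p13 = {(a, c): sum(p[a, b, c] for b in R) for a in R for c in R}
    p12 = {(a, b): sum(p[a, b, c] for c in R) for a in R for b in R}
    out = {}
    for a, b, c in product(R, repeat=3):
        indep = p1[a] * p2[b] * p3[c]
        out[a, b, c] = (
            indep
            + ((v0 + v1) ** n - v0 ** n) * (p1[a] * p23[b, c] - indep)
            + ((v0 + v2) ** n - v0 ** n) * (p2[b] * p13[a, c] - indep)
            + ((v0 + v3) ** n - v0 ** n) * (p3[c] * p12[a, b] - indep)
            + v0 ** n * (p[a, b, c] - indep)
        )
    return out
```

Calling `three_locus(p, v, n)` with a large n and all $c_{ij} = v_i + v_j > 0$ should return values close to the products $p_1 p_2 p_3$.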

² H. S. Jennings, Genetics, Vol. 2 (1917), pp. 97-154.

³ Professor Felix Bernstein called this author's attention to the biologically interesting
case m = 3.
In these formulae the role of independence of the original distribution is clearly
seen: If $p_{ij} = p_i p_j$ and $p_{123} = p_1 p_2 p_3$ then $p^{(n)}_{123} = p_{123}$ for every n and every l.d.
The same holds for every n and every $p_{123}$ if $v_0 = 1$, which implies that all $c_{ij}$ be
zero. If in (5') all $d_{ij} < 1$, hence all $c_{ij} > 0$, the limit theorem
$\lim_{n \to \infty} p^{(n)}_{123} = p_1 p_2 p_3$ results. $c_{ij} > 0$ means that complete linkage between any two genes is
excluded. If, on the other hand, e.g. $v_0 > 0$, $v_1 > 0$, $v_0 + v_1 = d_{23} = 1$, $c_{23} = 0$,
hence $v_0 < 1$, $v_2 = v_3 = 0$, we get $p^{(n)}_{123} \to p_1 p_{23}$. If $c_{23} = c_{12} = 0$ the triangular
relation (2) shows that $c_{13} = 0$ too, a case considered above.

It should be noticed that (5) is, of course, in agreement with the author's
equation (41) in quot. [1]. It only has to be observed (an obvious fact not
mentioned in my earlier paper) that in the former setup the sum of all the
$\alpha^{(n)}$ for every fixed m equals one. Thus for m = 3:

(7) $\alpha^{(n)}_{123} + \alpha^{(n)}_{1,23} + \alpha^{(n)}_{2,13} + \alpha^{(n)}_{3,12} + \alpha^{(n)}_{1,2,3} = 1$, (for every n),

and

(8) $\alpha^{(n)}_{123} = v_0^n$, $\alpha^{(n)}_{1,23} = (v_0 + v_1)^n - v_0^n = d_{23}^n - v_0^n$,
$\alpha^{(n)}_{2,13} = (v_0 + v_2)^n - v_0^n = d_{13}^n - v_0^n$,
$\alpha^{(n)}_{3,12} = (v_0 + v_3)^n - v_0^n = d_{12}^n - v_0^n$.

The preceding complete discussion of the case m = 3 suggests a remark
concerning the general case of m characters. In my earlier paper the influence
on the main limit theorem of certain ways of degeneration of the l.d. had not been
explicitly considered. In the following we shall use the v-distribution which
is a little shorter to write than the l.d. $l(\epsilon_1, \epsilon_2, \ldots, \epsilon_m)$. The v-distribution
contains only $2^{m-1}$ values with sum one, defined in a way similar to (3). The main
limit theorem ([1], theorem II, p. 42) states in our present notation that

(9) $\lim_{n \to \infty} p^{(n)}_{12 \cdots m} = p_1 p_2 \cdots p_m$,

if "complete linkage" between any group of genes is excluded. That implies
that not only $v_0 = v(0, 0, \ldots, 0) = 1$ must be excluded but even $v_{ij \cdots k}(0, \ldots, 0) = 1$,
where this last probability denotes a marginal distribution of the v-distribution
of an order $\ge 2$. To assure this it is necessary and sufficient that no $v_{ij}(0, 0) = 1$,
or no $d_{ij} = v_{ij}(0, 0) = 1$, or no $c_{ij} = 0$. Hence (9) holds if and only if no $c_{ij} = 0$.
If this condition is not satisfied the l.d. degenerates in various ways and the limit
theorem is to be modified accordingly. If, in particular, $v_0 = 1$, all $c_{ij} = 0$, and
$p^{(n)}_{12 \cdots m} = p_{12 \cdots m}$ for every n.

Between these two extreme cases ("no $c_{ij} = 0$", "all $c_{ij} = 0$") are the different
possibilities of r < m groups of completely linked characters (see [1] p. 36, iv)).
Consider e.g. m = 7 and $v_{1234}(0000) = 1$, $v_{567}(000) = 1$ (this is realized if
$v(0000000) > 0$, $v(0000111) > 0$ with sum of these two numbers equal to one); then
$\lim_{n \to \infty} p^{(n)}_{12 \cdots 7} = p_{1234}\, p_{567}$. Here the four characters 1, 2, 3, 4 act as one character and
$p^{(n)}_{1234} = p_{1234}$ for every n. Also $p^{(n)}_{567} = p_{567}$. Or if, for m = 6, $d_{12} = d_{34} = d_{56} = 1$
(realized if $v(000000) > 0$, $v(110000) > 0$, $v(001100) > 0$, $v(000011) > 0$, with
the sum of these four values equal to one) then $\lim_{n \to \infty} p^{(n)}_{12 \cdots 6} = p_{12}\, p_{34}\, p_{56}$. However, if
for m = 6 merely $d_{12} = d_{34} = 1$ (realized if, in a notation analogous to (3), $v_0$, $v_5$,
$v_6$, $v_{56}$, $v_{12}$, $v_{34}$, $v_{125}$, $v_{126}$ are the only non-zero values of the l.d.) then
$\lim_{n \to \infty} p^{(n)}_{12 \cdots 6} = p_{12}\, p_{34}\, p_5\, p_6$.

In general, with a proof which consists in a modification of the reasoning (p.
41) of my earlier paper, we may state the following complement to the main
limit theorem (9): If the l.d. is such that r < m disjoint groups $G_1, G_2, \ldots, G_r$
of completely linked characters exist, i.e. such that within each group no crossover
takes place, each group containing as many of the m numbers as compatible with the
definition but not less than two, and all groups together containing $s \le m$ of the m
elements, then, as $n \to \infty$, $p^{(n)}_{12 \cdots m}$ converges towards the product of those marginal
distributions (of the original generation) which correspond to these groups multiplied
by the marginal distributions of order one of the remaining free elements which are not
contained in any such group. In a formula:

(10) $\lim_{n \to \infty} p^{(n)}_{12 \cdots m} = p_{G_1} p_{G_2} \cdots p_{G_r}\, p_{\gamma_{s+1}} p_{\gamma_{s+2}} \cdots p_{\gamma_m}$.

We may also characterize these linked groups of maximum size by stating that
while within each group no crossover takes place there must be at least one c.p. $\ne 0$
among any two such groups and at least one among any group and any free
element. It may however be noted that if there is one c.p. > 0 among two
groups of complete linkage (or among a group and a free element) then all c.p.'s
among these two groups are different from zero. In fact, it follows by repeated
use of the triangular relation (2) that if one c.p. among two disjoint groups of
complete linkage is zero, all of them are zero. If, e.g., (1, 2, 3) and (5, 6, 8) are two
groups of complete linkage, i.e. $v_{123}(000) = 1$ and $v_{568}(000) = 1$, and if besides
$c_{16} = 0$, then $v_{123568}(000000) = 1$ and these six elements form a group of complete
linkage.

It may be noticed that the above statement of the generalized limit theorem
becomes simpler and more elegant by counting "free elements" as groups. It
might then run as follows: If $G_1, G_2, \ldots, G_t$ $(t \le m)$ are the maximal groups of
completely linked characters, then, under the hypotheses of the earlier paper, the gene
distribution in successive generations approaches a limit in which the original (marginal)
probabilities within each group $G_i$ are preserved and genes and sets of genes
from different groups are independently distributed.


ON THE DEFINITION OF DISTANCE IN THE THEORY OF THE GENE 

By Hilda Geiringer 
Wheaton College 

In several letters to this author Dr. I. M. H. Etherington of the University of
Edinburgh has raised questions concerning the author's definition of "distance"
proposed in Section 10 of her paper on Mendelian heredity,¹ comparing it with
the definition implicit in Professor J. B. S. Haldane's earlier treatment.² The
main content of the author's paper consists of some general limit theorems and
the integration of a certain system of difference equations. The distance definition
is a by-product subject to discussion.

¹ Annals of Math. Stat., Vol. 15 (1944), pp. 25-57.

"Distance" $d_{ij}$ between two genes i and j is defined by the author as the
mathematical expectation of the number of crossovers in the interval (i, j) with
respect to the "linkage distribution" (l.d.). This basic concept is introduced
as follows (page 32): If S is the set of numbers 1, 2, ..., m (m being the number
of Mendelian characters), A any subset of S and A' = S − A, we denote by
l(A) the probability that an individual with "maternal" genes $x_1, \ldots, x_m$
and paternal genes $y_1, \ldots, y_m$ transmit the paternal genes belonging to A and the
maternal genes belonging to A'. These $2^m$ probabilities constitute the l.d.
From these definitions the equality (G. (53'))

(1) $d_{ij} = c_{i,i+1} + c_{i+1,i+2} + \cdots + c_{j-1,j}$, $(i < j)$

is derived, where $c_{ij}$ is the probability of a "crossover" (c.p.) in (i, j). This
distance has the required additivity: (G. (54))

(2) $d_{ij} + d_{jk} = d_{ik}$, $(i < j < k)$.

Etherington points out that the term "distance" has an established currency
in genetics being the basis on which chromosome maps are constructed, and
that there is a standard method of calculating it in accordance with which (1)
is an "approximation valid only when the adjacent c.p.'s are small." Moreover
"the biological uniqueness has been lost for the value of $d_{ij}$ now depends on the
particular set of intermediate genes which we happen to be considering. If any
of them are omitted from consideration then the inequality (G. (13))

(3) $c_{ij} + c_{jk} \ge c_{ik}$

shows that in general $d_{ij}$ is diminished while if new genes are taken into consideration
$d_{ij}$ may increase." "In order that $d_{ij}$ should not depend on a particular
choice of intermediate genes the word 'crossover' in the definition given would
have to be interpreted as 'chiasma' instead of 'odd number of chiasmata'; and
then $d_{ij}$ cannot be evaluated in terms of the l.d. alone without further assumptions
regarding the interference of crossovers."

The point of view adopted in the author's paper was to regard the l.d. as the
basis from which everything else has to be inferred. The number m of Mendelian
characters is considered constant and the distance, being a mathematical
expectation with respect to the l.d., necessarily depends on it. In this conception
distance is not a geometric property which can be measured for any two genes
independently but rather a system of m(m − 1)/2 consistent numbers associated
to the m genes. There is no choice regarding the intermediate genes to be taken
into consideration; all known genes are to be considered, i.e. one has to use the
available relevant information in order to determine the l.d., the c.p.'s and the
distances. If the information is incomplete the results will be provisional and
subject to change; if it is satisfactory the same will be true for the distances.
Thus it is nothing but natural that $d_{ij}$ is changed if some genes are omitted from
consideration, or if new genes are discovered. In this setup "crossover,"
defined by means of the marginal distributions of second order of the l.d., means
a transition from the paternal to the maternal set or vice versa. (Expressed
in terms of the chiasma-hypothesis this means "odd number of chiasmata
between adjacent genes.") Additional assumptions "regarding the interference
of crossovers" are neither necessary nor admissible. All this is contained in the
l.d.

² Quotation [4a] in the author's paper. References to these papers will be distinguished
by the initials H and G.

Haldane's approach as translated by Etherington into the author's notation
is as follows. "The genes are considered to be distributed continuously along a
chromosome. Thus this approach unlike G.'s is not based on the l.d. of a
finite set of genes. We must think of one suffix, i, as referring to a gene at a
fixed locus on the chromosome, the others to variable loci, so that the c.p.'s
are variable. For any three genes i, j, k a quantity $\rho$ is defined by the equation

(4) $c_{ik} = c_{ij} + c_{jk} - \rho\, c_{ij} c_{jk}$, $(i < j < k)$.

Biological considerations show that $\rho$ is a number between 0 and 2 (small when
$c_{ij}$ and $c_{jk}$ are both small, increasing, on the whole, with $c_{ij} + c_{jk}$). The distance
$D_{ij}$ is defined by the statement

(5) $D_{kj}/c_{kj} \to 1$ as k approaches j $(c_{kj} \to 0)$,

together with the additive property, and from this with (4) Haldane's general
distance expression is derived:

(6) $D_{ij} = \int_0^{c_{ij}} \frac{dc_{ij}}{1 - \rho_0 c_{ij}}$.

Here $\rho_0 = \rho_0(c_{ij})$ denotes the limiting form of $\rho$ when k approaches j, and represents
biologically a property of the chromosome segment (i, j), a measure of
interference. Any suitable specification of this function $\rho_0(c_{ij})$ would constitute
a mathematical 'model' of the chromosome. If $\rho$ were constant we should
have $\rho_0 = \rho$ and

(7) $D_{ij} = -\frac{1}{\rho} \log (1 - \rho c_{ij})$.

Both Haldane and Geiringer considered the special cases $\rho = 2$ (no interference)
and $\rho = 0$ (complete interference) for which respectively

(7') $D_{ij} = -\tfrac{1}{2} \log (1 - 2 c_{ij})$,

(7'') $D_{ij} = c_{ij} = d_{ij}$.

Since $\rho$ is always between 0 and 2 Haldane concludes that the true value of $D_{ij}$
is between (7') and (7''), and he gives reasons for saying that (7') is nearly correct
for genes 'far apart,' (7'') for genes 'close together.'"
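For a constant interference parameter, formula (7) and relation (4) can be tried out numerically. The sketch below is illustrative only: the function names `haldane_distance` and `combine` are hypothetical, and the parameter `rho` stands for the constant written $\rho$ in (4) and (7). It treats the value 0 as the limiting case (7'').

```python
import math


def haldane_distance(c, rho):
    """Formula (7): D = -(1/rho) * log(1 - rho * c) for constant rho.

    For rho = 0 the limiting form (7''), D = c, is returned.
    Requires rho * c < 1 so that the logarithm is defined.
    """
    if rho == 0:
        return c
    return -math.log(1.0 - rho * c) / rho


def combine(c_ij, c_jk, rho):
    """Relation (4): crossover probability of the joined interval (i, k)."""
    return c_ij + c_jk - rho * c_ij * c_jk


if __name__ == "__main__":
    for rho in (0, 1, 2):
        c_ik = combine(0.10, 0.20, rho)
        lhs = haldane_distance(0.10, rho) + haldane_distance(0.20, rho)
        print(rho, c_ik, lhs, haldane_distance(c_ik, rho))
```

For each value of `rho` the sum of the two partial distances agrees with the distance computed from the combined crossover probability, which is the additive property the definition demands.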
If the author is right, this seems to be the standard definition accepted in
genetics as mentioned above by Etherington. A few, not exhaustive, comments
may be added. Writing in (6) t for the variable of integration and $\rho_0 = \rho_0(t)$
it is seen that the expression

(6) $D_{ij} = \int_0^{c_{ij}} \frac{dt}{1 - t \rho_0(t)}$

contains the unknown function $\rho_0(t)$, which is unspecified except for the statement
that it is bounded between 0 and 2. It is immediately seen that with an
arbitrary $\rho_0(t)$ and without a restriction taking the place of (4) this distance (6)
will not be additive in the sense of (2). By imposing, after a choice of $\rho_0(t)$,
appropriate restrictions on the $c_{ij}$ additivity may be achieved. For instance in
the particular case $\rho_0(t) = \rho = \text{const}$, (2) holds by virtue of (4). For such a set
of restrictions it has then to be proved that the corresponding "model" is "consistent,"
i.e. that the so restricted c.p.'s form a compatible set of marginal
distributions of second order of an m-variate distribution, the l.d.

These different points will be exemplified presently by studying the particular
case $\rho_0(t) = \rho$, where $\rho$ is a suitably chosen constant; the parameter $\rho$ is to be
fitted to the observations under consideration. It may be impossible to reproduce
a set of observations satisfactorily if one parameter only is available. In
fact, Haldane's paper suggests that it is not only the particular case $\rho = \text{const}$
he has in mind. It seems however that if $D_{ij}$ is given by (6) with a non-constant
$\rho_0(t)$, complicated and perhaps (biologically) not very meaningful conditions may
have to be introduced in order to assure additivity of the distances and consistency
of the respective model. This author was unable to work out examples
of more general and at the same time appropriate and fairly simple assumptions
for the unknown function $\rho_0(t)$.

If $\rho = \text{const}$, then (7) under the restriction (4) furnishes an additive distance
definition because:

$-\rho\{D_{ij} + D_{jk}\} = \log (1 - \rho c_{ij}) + \log (1 - \rho c_{jk})$
$\qquad = \log (1 - \rho c_{ij} - \rho c_{jk} + \rho^2 c_{ij} c_{jk}) = \log (1 - \rho c_{ik}) = -\rho D_{ik}$,

because of (4). Let us now investigate whether there is a consistent system of
c.p.'s satisfying (4). Put, as in G.(48), $c_{i,i+1} = p_i$, combine (4) with G.(50) and
write $\rho = 2\epsilon$. It follows that (4) is satisfied with $0 \le \epsilon \le 1$, if:

(8) $p_{ij} = \epsilon\, p_i p_j$, $p_{ijk} = \epsilon^2 p_i p_j p_k$, $\ldots$.

Here $p_{ij}$ is the probability of the simultaneous occurrence of the "events"
numbered i and j, etc. For $\epsilon = 0$ we get "disjoint events" (see G.³ for the
discussion of consistency). Assume now $\epsilon > 0$. By some considerations,
analogous to those p. 54 G, the following necessary and sufficient condition of
consistency follows:

(9) $\prod_{t=1}^{m-1} (1 - \epsilon p_t) \ge 1 - \epsilon$, $(\epsilon > 0)$.
This restriction (not considered by Haldane or Etherington) is, of course,
relevant. If e.g. m = 3, $p_1 = p_2 = 4/5$, then $\epsilon$ must be $\ge 15/16$; or if m = 4,
$p_1 = p_2 = p_3 = \tfrac{1}{2}$, $\epsilon \ge 3 - \sqrt{5}$ results. The restriction required by the "linear
theory" is

(10) I’' ^ ^ ’ (t = 1, 2, • • • , m - 1). 

Hence this model is consistent under certain restrictions. It is, in contrast
to Etherington's contention, different from iii) G. p. 54. The corresponding
distance definition (7) is different from the author's. The $D_{ij}$ thus defined are
additive, and $D_{ij}$ depends on $c_{ij}$ only and not on the intermediate genes. The
author's definition of distances, $d_{ij}$, is general, additive and seems to the author
to be well adapted to the biological situation; since the definition of $d_{ij}$ is not
related to any particular model it is compatible with any model, which may
contain any desired (consistent) assumptions about "interference," etc. For
example in G. iv) p. 55, an n-parametric model has been suggested which seems
fairly flexible.
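The consistency condition (9), $\prod_{t=1}^{m-1}(1 - \epsilon p_t) \ge 1 - \epsilon$, is easy to test for given adjacent crossover probabilities. The helper below is a hypothetical sketch (the name `is_consistent` and the small rounding tolerance are assumptions of this example); it can be used to reproduce the two numerical cases quoted above.

```python
def is_consistent(p, eps, tol=1e-12):
    """Check condition (9) for adjacent crossover probabilities p_1..p_{m-1}.

    p   : list of the m - 1 values p_t = c_{t,t+1}.
    eps : the constant epsilon = rho / 2, assumed to satisfy 0 < eps <= 1.
    tol : slack allowed for floating-point rounding at the boundary.
    """
    product = 1.0
    for p_t in p:
        product *= 1.0 - eps * p_t
    return product >= 1.0 - eps - tol


if __name__ == "__main__":
    # m = 3, p_1 = p_2 = 4/5: the boundary lies at eps = 15/16.
    print(is_consistent([0.8, 0.8], 15 / 16), is_consistent([0.8, 0.8], 0.9))
    # m = 4, p_1 = p_2 = p_3 = 1/2: the boundary lies near 0.764.
    print(is_consistent([0.5] * 3, 0.77), is_consistent([0.5] * 3, 0.75))
```

Values of $\epsilon$ just below the stated bounds fail the test, while values at or above them pass.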

It may however seem more acceptable to the biologist not to use a general
distance definition but to define "distance" merely in relation to some sufficiently
general "model" (such that the distance definition would vary with the model),
instead of accepting an all-over definition as ventured in the author's paper.
The particular model (8) in connection with its related distance definition (7)
might give an example of such an approach.⁴

³ As Etherington remarks, eq. (14') in the author's original paper is not correct. One
can only state that (47) holds. The mistake is however without consequence since no
conclusions are drawn from (14'). The same mistake was pointed out by Professor Kai
Lai Chung.

⁴ Etherington writes: "I have been kindly allowed to read Professor Geiringer's MS,
and feel that some comments are necessary.

The standard procedure for calculating the distance between two linked genes is as
follows. A selection of intermediate genes is taken and the adjacent crossover values 
calculated, giving a provisional estimate of the distance as in Geiringer's formula (1). 
When further intermediate genes are added to the selection, it is found that the provisional 
distance increases, but there is apparently a maximum value beyond which it cannot be 
increased. This unknown maximum value is the distance, and the geneticist accepts (1) 
as the distance when he is sure that he has observed a sufficient number of intermediate 
genes to give a good enough approximation to the true distance. Thus Geiringer's formula 
(1) gives the geneticist's true distance only on the understanding that it includes all genes 
intermediate between i and j; but generally speaking the great majority of these genes 
may be unobservable in the sense that they have no observably distinct alleles by means 
of which the c.p.'s could be calculated, though from time to time fresh genes may become
observable by mutation. 

In some cases the above procedure fails because not enough intermediate genes can be 
observed; then Haldane's analysis is useful. It should be emphasized that his distance is 
additive by definition. (For a geometrical analogy, think of the genes as points closely 
distributed along a curve, chords representing c.p.’s. Haldane’s definition of the distance 
is analogous to defining arc length of the curve as a limiting sum of chords.) In my
transcription of his treatment, I should perhaps have made it clearer that the derived formula
(6) gives only the distance measured from the initially chosen and fixed gene i to an
arbitrary gene j. Other distances $D_{jk}$, $(i < j < k)$, are deduced from it by the postulate
of additivity ($D_{jk} = D_{ik} - D_{ij}$). If the origin i is changed, there will be a similar formula
(6), but it should not be assumed that the function $\rho_0$ is the same. In referring to certain
conditions necessary 'to assure additivity,' Geiringer evidently means conditions that the
function $\rho_0$ may be the same for all origins i. These conditions would be interpreted
biologically as asserting uniformity of interference along the chromosome. I agree that there
are further points to be cleared up in this connection. 

If I might sum up the discussion, I would say that the geneticist's conception of the 
distance between genes is an actual property of the corresponding chromosome segment. 
Geiringer's definition represents the best possible general approach to this from the limited 
data of the l.d. alone. Haldane's definition fits the geneticist's conception, and his in¬ 
vestigation is an attempt to get the best estimate of the distance by making approximate 
assumptions as to what happens between the observed genes. It is based on the unob¬ 
servable crossover-distribution of a supposed infinite set of genes, but can be applied to 
particular models of this infinite c.d. so as to derive results which involve only a finite and 
observable c.d. Finally it should be mentioned that in the paper quoted, Haldane gave 
also an alternative method for the case $\rho = 2$, leading to the same formula (7'), which is
really equivalent to defining the distance as the mathematical expectation of the number of
chiasmata (not crossovers in G.'s sense) in the interval (i, j)."


A CRITERION OF CONVERGENCE FOR THE CLASSICAL ITERATIVE 
METHOD OF SOLVING LINEAR SIMULTANEOUS EQUATIONS 

By Clifford E. Berry 

Consolidated Engineering Corporation, Pasadena, Calif.

The recent development of two devices¹,² for solving linear simultaneous
equations by means of the classical iterative method³ has stimulated the writer
to investigate convergence criteria for the method. There are in the literature⁴
necessary and sufficient criteria for convergence of symmetric systems, and
sufficiency criteria for general systems. So far as the writer knows, however, this
is the first development of a necessary and sufficient criterion for convergence
in the general case. The results obtained are applicable to any arbitrary square
non-singular matrix in which $a_{ii} \ne 0$.

Let the set of equations be represented by 

(1) $AX = G$,

in which A is the square matrix of the coefficients, X is the column matrix of the
unknowns, and G is the column matrix of the constant terms. |A| is the
determinant of A.

¹ Morgan, T. D., Crawford, F. W., "Time-saving computing instruments designed
for spectroscopic analysis", The Oil and Gas Journal, August 26 (1944), pp. 100-105.

² Berry, C. E., Wilcox, D. E., Rock, S. M., Washburn, H. W., "A computer for solving
linear simultaneous equations", to be published.

³ Hotelling, Harold, "Some new methods in matrix calculation", The Annals of
Mathematical Statistics, Vol. XIV (1943), pp. 1-34.

⁴ Mises, R. von and Pollaczek-Geiringer, Hilda, "Zusammenfassende Berichte.
Praktische Verfahren der Gleichungsauflösung", Zeitschrift für angewandte Math. und
Mechanik, Vol. 9 (1929), pp. 68-77, and 152-164.

We define a matrix $A_1$ which contains the prediagonal and diagonal terms of A,
and a matrix $A_2$ which contains the postdiagonal terms of A. According to this
definition,

(2) $A_1 + A_2 = A$.

In the classical iterative method, arbitrary (or approximate) values of the x's
are chosen, the first equation is solved for the first unknown, the second equation
for the second unknown, etc., using in each equation the most recent approximations
to the x's. This process may be written

(3) $A_1 X^{(1)} + A_2 X^{(0)} = G$,

in which $X^{(0)}$ is the initial approximation matrix, and $X^{(1)}$ is the approximation
matrix existing at the end of the first iterative cycle. The superscripts indicate
the number of the approximation. The next cycle is described by

(4) $A_1 X^{(2)} + A_2 X^{(1)} = G$,

and the mth by

(5) $A_1 X^{(m)} + A_2 X^{(m-1)} = G$.

The method yields a solution, i.e., converges, if

$\lim_{m \to \infty} (X^{(m)} - X) = 0$.

Solving (5) explicitly for $X^{(m)}$,

(6) $X^{(m)} = A_1^{-1} G - A_1^{-1} A_2 X^{(m-1)}$.

Subtracting X from each side,

(7) $X^{(m)} - X = A_1^{-1} G - A_1^{-1} A_2 X^{(m-1)} - X$,

and making use of (1) and (2)

(8) $X^{(m)} - X = -A_1^{-1} A_2 (X^{(m-1)} - X)$.

Since (8) applies for any value of m, we may write

(9) $X^{(m-1)} - X = -A_1^{-1} A_2 (X^{(m-2)} - X)$,

and continuing this process,

(10) $X^{(m)} - X = (-A_1^{-1} A_2)^m (X^{(0)} - X)$.

Now, $\lim_{m \to \infty} (X^{(m)} - X) = 0$ if and only if

(11) $\lim_{m \to \infty} (-A_1^{-1} A_2)^m = 0$.
This is a general result, applicable to any arrangement of the terms of an
arbitrary square matrix A, subject only to the conditions that $|A| \ne 0$ and that
no diagonal term of A is zero. In this latter exceptional case, the iterative
method itself obviously cannot be applied.

The criterion (11) clearly shows that the order in which the elements of the
matrix A are arranged is important. For instance, it is plain that an arrangement
in which the diagonal terms are large and the off-diagonal terms, particularly
the post-diagonal terms, are small will tend to favor convergence.

A somewhat relaxed condition, which is sufficient but not necessary, is
obtained through the use of an inequality used by Hotelling³, namely,

(12) $N(B^m) \le [N(B)]^m$,

in which N(B) is the norm of the matrix B, that is, the square root of the sum
of the products of its elements by their complex conjugates, or in the case of a
real matrix the square root of the sum of the squares of the elements.

The condition is that, if

(13) $N(A_1^{-1} A_2) < 1$,

then

(14) $\lim_{m \to \infty} (A_1^{-1} A_2)^m = 0$.

Criterion (13) is readily computed, since $A_1^{-1}$, the reciprocal of a triangular
matrix, is readily computed, and the post-multiplication by $A_2$ involves a number
of zero terms.

A more stringent condition than (13), though still not a necessary condition,
is that if some finite number p can be found such that

(15) $N[(A_1^{-1} A_2)^p] < 1$,

then (14) follows. Since n matrix squarings result in a value of $p = 2^n$, the size
of the norm for fairly large values of p can be investigated without excessive
labor.
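The iterative cycle (5) and the sufficient criterion (13) can be sketched in a few lines of code. Everything in the block below is illustrative: the function names are hypothetical, matrices are plain nested lists, and the method of successive substitution described in the note is implemented in the form commonly called Gauss-Seidel iteration. The matrix $A_1^{-1} A_2$ is obtained by forward substitution rather than by explicit inversion.

```python
import math


def gauss_seidel(A, G, x0, cycles):
    """Run the iterative cycle (5), always using the most recent x values."""
    n = len(A)
    x = list(x0)
    for _ in range(cycles):
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (G[i] - s) / A[i][i]   # requires a nonzero diagonal term
    return x


def iteration_matrix(A):
    """Return B = A1^{-1} A2, solving A1 B = A2 column by column.

    A1 holds the prediagonal and diagonal terms of A, A2 the postdiagonal terms.
    """
    n = len(A)
    B = [[0.0] * n for _ in range(n)]
    for col in range(n):
        for i in range(n):
            rhs = A[i][col] if col > i else 0.0          # entry of A2
            s = sum(A[i][j] * B[j][col] for j in range(i))
            B[i][col] = (rhs - s) / A[i][i]
    return B


def frobenius_norm(B):
    """Norm N(B) of a real matrix: square root of the sum of squared elements."""
    return math.sqrt(sum(b * b for row in B for b in row))


if __name__ == "__main__":
    A = [[4.0, 1.0, 1.0], [1.0, 5.0, 2.0], [1.0, 2.0, 6.0]]
    G = [9.0, 17.0, 23.0]
    print(frobenius_norm(iteration_matrix(A)))
    print(gauss_seidel(A, G, [0.0, 0.0, 0.0], 60))
```

When the printed norm is below one, criterion (13) guarantees convergence. A norm of one or more decides nothing, since (13) is sufficient but not necessary.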


A REMARK ON INDEPENDENCE OF LINEAR AND QUADRATIC 
FORMS INVOLVING INDEPENDENT GAUSSIAN VARIABLES 

By M. Kac
Cornell University 

The purpose of this note is to call attention to the following useful theorem, 
which to the best of my knowledge was never stated explicitly. 

If $X_1, X_2, X_3, \ldots, X_n$ are identically distributed, independent Gaussian random
variables each having mean 0, then the necessary and sufficient condition that

$\sum_{j,k=1}^{n} a_{jk} X_j X_k$ and $\sum_{j=1}^{n} \alpha_j X_j = \alpha \cdot X$

be independent, is that

$A\alpha = 0$,

where A is the matrix of the quadratic form, $\alpha$ the vector $(\alpha_1, \alpha_2, \ldots, \alpha_n)$ and X the
vector $(X_1, X_2, \ldots, X_n)$.

Proof of sufficiency.¹ Since $A\alpha = 0$, it follows that 0 is an eigenvalue of A,
and $\alpha$ is a corresponding eigenvector.

Denoting by $\lambda_2, \ldots, \lambda_n$ the remaining eigenvalues and by $\beta_2, \ldots, \beta_n$ the
corresponding eigenvectors, we have

$\sum_{j,k=1}^{n} a_{jk} X_j X_k = \sum_{j=2}^{n} \lambda_j (\beta_j \cdot X)^2$.

Since the $\beta$'s are orthogonal to $\alpha$, it follows that the linear combinations $\beta_j \cdot X$
are independent of $\alpha \cdot X$, and this completes the proof.

Proof of necessity. From the assumption of independence it follows that

$\sum_{j,k=1}^{n} a_{jk} X_j X_k$ and $\left( \sum_{j=1}^{n} \alpha_j X_j \right)^2 = \sum_{j,k=1}^{n} \alpha_j \alpha_k X_j X_k$

are independent. Thus by Craig's theorem²

$AB = 0$,

wherein $B = ((\alpha_j \alpha_k))$.

This implies almost immediately that $A\alpha = 0$.
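
The last step can be spelled out as follows. This is a short sketch in matrix notation, assuming $\alpha \ne 0$ (for $\alpha = 0$ the conclusion holds trivially) and writing $\alpha$ as a column vector, so that $B = \alpha \alpha^{T}$:

$$AB = A\alpha\alpha^{T} = 0 \;\Longrightarrow\; A\alpha\,(\alpha^{T}\alpha) = (A\alpha\alpha^{T})\,\alpha = 0 \;\Longrightarrow\; A\alpha = 0,$$

since $\alpha^{T}\alpha = \sum_{j} \alpha_j^2 > 0$ is a nonzero scalar.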

¹ Added in proof: Dr. L. Guttman has kindly pointed out to me that the proof of
sufficiency given here has been used by D. Jackson in the article "Mathematical principles
in the theory of small samples", Amer. Math. Month., Vol. 42 (1936), pp. 344-364, see in
particular pp. 354-355. Jackson considers only the independence of $\bar{x}$ and $s^2$, which is of
crucial importance in deriving Student's distribution.

² A. T. Craig, Annals of Math. Stat., Vol. 14 (1943), pp. 195-197; see also H. Hotelling,
ibid., Vol. 15 (1944), pp. 427-429.
Presented on September 16, 1945 at the Rutgers meeting of the Institute

1. On the Variance of a Random Set in n Dimensions. Herbert Robbins,
Lieutenant USNR Postgraduate School, Annapolis, Md. 

Using a general formula for the moments of the measure of a random set X (Ann. Math. 
Stat. Vol. XV (1944), pp. 70-74) we find the mean and variance in the case where X is a 
random sum of n-dimensional intervals with sides parallel to the coordinate axes, thus
generalizing the results previously found (loc. cit.) for the case n = 1.

2. The Non-Central Wishart Distribution and its Application to Problems in 
Multivariate Statistics. T. W. Anderson, Princeton University. 

The non-central Wishart distribution is the joint distribution of sums of squares and 
cross-products of deviations of observations from multivariate normal distributions with 
identical variance-covariance matrices and with different sets of means. The rank of the 
non-central Wishart distribution is defined as the rank of the matrix of sets of means. In a 
previous paper (by M. A. Girshick and the present author) the non-central Wishart
distribution is given explicitly for the rank one and two cases and indicated for the case of any
rank. In the present paper the characteristic function of the non-central Wishart distribu¬ 
tion is given for general rank. The distribution, which is given in the form of a multiple 
integral, is the product of a central Wishart distribution and a symmetric function of the 
roots of a determinantal equation involving the matrix of squares and cross products of
observations and the matrix of population means. It is shown that the convolution of two 
non-central Wishart distributions is again a non-central Wishart distribution if the vari¬ 
ance-covariance matrices are the same. The moments of the generalized variance and the 
moments of the likelihood ratio criterion for testing certain linear hypotheses (for example, 
the hypothesis that the means of a set of populations are identical, given that the matrices 
of population variances and covariances are the same) are obtained for the linear and planar 
non-central cases in terms of infinite series. Likelihood ratio criteria are developed for
testing the dimensionality of the means of a set of multivariate populations (with identical 
variances and covariances) on the basis of one sample from each. The criterion for testing 
whether the dimensionality is h in the space of p dimensions is a symmetric function of p—h 
smallest roots of the determinantal equation involving the sample estimate of the matrix 
of variances and covariances and the sums of squares and cross-products of deviations of 
sample means. The maximum likelihood estimate of the hyperplanes and positions of 
means on them are obtained. The asymptotic distributions of the criteria are
$\chi^2$-distributions.

3. The Effect on a Distribution Function of Small Changes in the Population 
Function. Burton H. Camp, Wesleyan University. 

It is generally assumed in the application of distribution theory that, if the actual
population function is not very different from the one used in the theory, then the true sampling
distribution of a statistic will not be very different from the one obtained in the theory. 
But elsewhere in mathematics we do not assert that a conclusion will be only slightly modi¬ 
fied by a small deviation in the hypothesis. This paper presents some theorems which are 
useful in determining the maximum effect on a sampling distribution of certain kinds of 
small changes in the population function. 
4. Composite Distributions. Casper Goffman and Benjamin Epstein, Westinghouse
Electric Corporation.

Let $f(x; \theta_1, \theta_2, \cdots, \theta_n)$ be a function such that for every point $(\theta_{10}, \theta_{20}, \cdots, \theta_{n0})$ in
parameter space, $x$ is a random variable with p.d.f. $f(x; \theta_{10}, \cdots, \theta_{n0})$. Suppose further
that the parameters $\theta_1, \cdots, \theta_n$ are themselves random variables whose p.d.f.'s are
given respectively by $\phi(\theta_1), \cdots, \phi(\theta_n)$. Using a concept of "probability contained in an
interval" and an axiom based on this concept, we show that $x$ is a random variable with
p.d.f. $g(x)$ given by the formula

(1) $g(x) = \int \cdots \int f(x; \theta_1, \cdots, \theta_n)\, \phi(\theta_1) \cdots \phi(\theta_n)\, d\theta_1 \cdots d\theta_n$.

In this paper we consider statistical properties of the function $g(x)$ in cases of particular
interest in applications. The cases treated here are (a) where the mean, $\bar{x}$, is the only variable
parameter, (b) where the standard deviation, $\sigma$, is the only variable parameter, and
(c) where the mean, $\bar{x}$, and the standard deviation, $\sigma$, are both variable parameters; $\bar{x}$ and $\sigma$
being independent.

It is shown that problems (a) and (b) are equivalent respectively to the sum and product 
of two independent random variables, one of which has zero mean. Formulae for the 
moments in problem (c) are then derived in terms of the formulae obtained for (a) and (b). 
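
The equivalence stated for problem (a) can be checked numerically. The sketch below is only an illustration under assumptions not made in the abstract: the density f and the density of the variable mean are both taken to be normal, the integral in formula (1) is evaluated by a simple midpoint rule, and all function names are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal variable with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def composite_pdf(x, sd, prior_mean, prior_sd, steps=4000, width=10.0):
    """Formula (1) with one variable parameter (the mean), by the midpoint rule.

    Assumption: f(x; theta) is normal with mean theta and fixed sd, and the
    density of theta is normal with mean prior_mean and sd prior_sd.
    """
    lower = prior_mean - width * prior_sd
    h = 2.0 * width * prior_sd / steps
    total = 0.0
    for i in range(steps):
        theta = lower + (i + 0.5) * h
        total += normal_pdf(x, theta, sd) * normal_pdf(theta, prior_mean, prior_sd)
    return total * h


def sum_pdf(x, sd, prior_mean, prior_sd):
    """Density of (variable mean) + (independent zero-mean normal deviation)."""
    return normal_pdf(x, prior_mean, math.sqrt(sd * sd + prior_sd * prior_sd))
```

Under these normal assumptions the two functions should agree closely at any x, which is the "sum of two independent random variables, one of which has zero mean" reading of problem (a).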

5. Population, Expected Values and Sample. E. J. Gumbel, New School for 
Social Research. 

Let $x$ be an unlimited continuous variate, and let $F(x)$ be the probability of a value equal
to, or less than, $x$. Then the expected values $x_m$, for $n$ observations, are approximations
to the most probable $m$th values and defined by $F(x_m) = F_1 + (F_n - F_1)(m - 1)/(n - 1)$,
where $F_1$ and $F_n$ are the probabilities of the most probable first and the most probable
last value. The probabilities $F_1$, $1 - F_n$ and $(F_n - F_1)/(n - 1)$ are of the order of
magnitude $1/n$.

The distribution of the expected values $x_m$ differs from the distribution of the sample
and from the theoretical distribution. However, for a symmetrical distribution the mean
and the odd moments about mean calculated from the expected values coincide with the
mean and the moments of the population. For the normal distribution, the expected
standard deviation $\sigma(n)$ divided by the standard deviation $\sigma$ of the population and traced
on normal probability paper approximates a linear function of $\sqrt{\log n}$. The approach of
$\sigma(n)$ toward $\sigma$ is slow. For 500 observations, $\sigma(n)$ is about 99% of $\sigma$. The moments of the
distribution of the expected values exist even in the case that the moments of the theoretical
distribution diverge.
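
A minimal sketch of the defining relation for the expected values, for a normal population. The abstract does not state F_1 and F_n, so the defaults 1/(n+1) and n/(n+1) used below are purely illustrative assumptions, chosen only because they are symmetric and of order 1/n. The function names are hypothetical as well.

```python
import math
from statistics import NormalDist


def expected_values(n, f1=None, fn=None, dist=NormalDist()):
    """Solve F(x_m) = F_1 + (F_n - F_1)(m - 1)/(n - 1) for m = 1, ..., n.

    ASSUMPTION: when f1 and fn are not supplied they default to 1/(n + 1)
    and n/(n + 1); these are placeholders, not values from the abstract.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    if f1 is None:
        f1 = 1.0 / (n + 1)
    if fn is None:
        fn = n / (n + 1)
    return [dist.inv_cdf(f1 + (fn - f1) * (m - 1) / (n - 1)) for m in range(1, n + 1)]


def central_moment(values, k):
    """k-th moment of the values about their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** k for v in values) / len(values)


def expected_sd(n):
    """Standard deviation computed from the n expected values."""
    return math.sqrt(central_moment(expected_values(n), 2))
```

With symmetric F_1 and 1 - F_n, the mean and the third moment of the expected values vanish up to rounding, in line with the statement about symmetrical distributions, and expected_sd(n) stays below the population value of 1.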

6. On Optimum Estimates for Stratified Samples. Morris H. Hansen and
William N. Hurwitz, Bureau of the Census.

A stratified sample is drawn from a population with R strata. Neyman found the optimum
sample allocation for the "best unbiased linear estimate." However, biased but
consistent estimates of the form $x_i / y_i$, where both $x_i$ and $y_i$ are random variables, have been
found to give more reliable results in a large class of problems. Even more efficient estimates
can be obtained by finding the values of $n_i$ (the sample size) and $w_i$ which minimize
the mean square error of estimates of the form $\sum w_i \frac{x_i}{y_i}$ or $\frac{\sum w_i x_i}{\sum w_i y_i}$.
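
The two estimate forms can be written out as follows, reading them as a weighted sum of stratum ratios and a ratio of weighted sums. This is only a sketch of the arithmetic: the function names are hypothetical, and the choice of sample sizes and weights that minimizes the mean square error is not attempted here.

```python
def weighted_sum_of_ratios(w, x, y):
    """First form: sum over strata of w_i * (x_i / y_i)."""
    return sum(wi * xi / yi for wi, xi, yi in zip(w, x, y))


def ratio_of_weighted_sums(w, x, y):
    """Second form: (sum of w_i * x_i) / (sum of w_i * y_i)."""
    numerator = sum(wi * xi for wi, xi in zip(w, x))
    denominator = sum(wi * yi for wi, yi in zip(w, y))
    return numerator / denominator
```

The two forms coincide when every stratum has the same ratio x_i / y_i and the weights sum to one, and they differ otherwise.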
7. Pearsonian Correlation Coefficients Associated with Least Squares Theory.
Paul S. Dwyer, University of Michigan. (Read by Title)

In least squares theory we have the predicting variable x, the observed value of the
predicted variable, y, the residual e, and the predicted value of the predicted variable y.
The purpose of this paper is to study the Pearsonian coefficients resulting from correlating 
all these variables in pairs (a) in the case of a single predicted variable and (b) in the case 
of two or more predicted variables. The results yield such coefficients as multiple correla¬ 
tion, multiple alienation, partial correlation, part correlation, and new coefficients not 
previously in use. The results are given in expanded, determinant, and matrix form. A 
simplified calculational technique is provided.
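
For case (a) with a single predicting variable, the pairwise coefficients can be sketched as below. This illustrates standard least squares identities and is not the calculational technique of the paper; the function names are hypothetical.

```python
import math


def pearson(a, b):
    """Pearsonian correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    s_aa = sum((u - mean_a) ** 2 for u in a)
    s_bb = sum((v - mean_b) ** 2 for v in b)
    return s_ab / math.sqrt(s_aa * s_bb)


def least_squares_parts(x, y):
    """Fit y = a + b*x by least squares; return (predicted values, residuals)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y)) / sum(
        (u - mean_x) ** 2 for u in x
    )
    intercept = mean_y - slope * mean_x
    predicted = [intercept + slope * u for u in x]
    residuals = [v - p for v, p in zip(y, predicted)]
    return predicted, residuals
```

Correlating the four variables in pairs, the residual is uncorrelated with both the predicting variable and the predicted value, the correlation of y with its predicted value equals the absolute value of r(x, y), and the correlation of y with the residual equals the square root of 1 - r(x, y)^2.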



NEWS AND NOTICES 

Readers are invited to submit to the Secretary of the Institute news items of interest.

Personal Items 

Dr. Kenneth Arnold, recently with the Statistical Research Group, Columbia 
University, has accepted an assistant professorship in Mathematics at the Uni¬ 
versity of Wisconsin. 

Dr. Leo Aroian has returned to his position at Hunter College after serving as 
Research Associate in the Applied Mathematics Panel Project at the University 
of California. 

Mr. Geoffrey Beall is now statistician for the Institute of Paper Chemistry at
Appleton, Wisconsin. 

Mr. Robert E. Breden has accepted a position with the Personnel Research 
Department of Proctor and Gamble at Cincinnati. 

Mr. William F. Elkin, who has been Social Science Analyst with the Vital 
Statistics Division of the Bureau of the Census, has accepted a position as Vital
Statistician at Oak Ridge, Tenn. 

Mr. Robert M. Ewing of the U. S. Rubber Company has been transferred to
Detroit. He now serves in the capacity of Tire Development Engineer. 

Dr. A. S. Householder, formerly of the University of Chicago, is now with the
Fire Control Division of the Naval Research Laboratory in Washington.

Dr. Irving Kaplansky has been appointed to an assistant professorship of 
mathematics at the University of Chicago.

Mr. Amrom H. Katz has been promoted from Associate Physicist to Physicist 
at the Aerial Photographic Laboratory at Wright Field. 

Dr. William G. Madow of the Bureau of the Census will serve as Visiting
Professor of Statistics at the University of Sao Paulo, Brazil, for the full academic
year which begins on March 16. He expects to return to the United States in
January of 1947. 

Dr. J. E. Morton, formerly of Knox College, has joined the staff of the National
Bureau of Economic Research. 

Dr. A. C. Olshen has returned from his navy work in Washington to his position
as Actuary and Chief Examiner of the Oregon Insurance Department at Salem,
Oregon. 

Mr. Joseph S. Rhodes (formerly Joseph Rosenthal) now holds the position of
Sampling Specialist in the Bureau of the Census. 

Prof. Paul R. Rider, on leave from Washington University, is teaching at 
Shrivenham American University in England.

Dr. J. Wolfowitz has accepted an associate professorship in Statistics at North 
Carolina State College. Professor Wolfowitz is serving as Associate Editor of 
the Journal of the American Statistical Association.

New Members

The following persons have been elected to membership in the Institute:

Astrachan, Asso. Prof. Max, Ph.D. (Brown) Antioch College, Yellow Springs, Ohio. 

Bales, R. P., B.A. (Toronto) Tech. Sup., Dominion Rubber Co., St. Jerome, Que. Can. 

Baños, Olegario Fernandez, D.C. (Madrid) Catedratico, Univ. of Madrid, Calle Lopez
de Hoyos 7, Spain. 

Barkan, Herbert, M.A. (Columbia) Econ. Analyst, 60 8th Ave., Brooklyn, N. Y. 

Bloom, Royal F., M.A. (Minnesota) Lt. Comdr. USNR, Test and Res. Section, Bureau of 
Naval Personnel, 61 O. Ridge Road, Greenbelt, Md.

Blommers, Paul J., Ph.D. (Iowa) Univ. Examiner and Registrar, 114 Univ. Hall, State 
Univ. of Iowa, Iowa City, Iowa. 

Brier, Glenn, A.M. (George Washington) Meteorologist, US Weather Bureau, Washington
25, D. C.

Brixey, Nancy, B.A. (Vassar) Economist, Davis and Gilbert Law Firm, 1 E. 44 St., 70
East 77 St., New York 21, N. Y.

Caplan, Benjamin, Ph.D. (Chicago) Econ., OPA, 2831 28th St. N.W., Washington 8, D. C. 

Chassan, Jack, B.S. (C.C.N.Y.) Stat., Office of Stat. Control, Hdq. A.A.F. 3013 30th St. 
S.E., Washington, D. C. 

Cornell, Dr. F. G., Ph.D. (Columbia) U. S. Office of Ed., Tempo. M. 26th and Water, N.W., 
Washington, D. C. 

Cramér, Prof. Harald, Ph.D. (Stockholm) Skärviksvägen 7, Djursholm, Sweden.

Dempsey, William B., Ph.D. (Harvard) Regent of the School of Commerce and Finance, 
Saint Louis Univ., 3674 Lindell Blvd., St. Louis 8, Mo.

Derrick, Asst. Prof. Lucile, M.A. (Peabody) Univ. of Chicago, School of Business, 664S
Kimbark, Chicago 37, Ill.

Dominguez, Emilia A., Ec.S. (Buenos Aires) Actuary, Supt. Personas Juridicas de Buenos
Aires, Martinez Castro 765, Buenos Aires, Argentina.

Dominguez, Jose F., Ec. S. (Buenos Aires) Tech. Council, Instituto Nacional de Prevision
Social, Martinez Castro 765, Buenos Aires, Argentina.

Duncan, Asst. Prof. Acheson, Ph.D. (Princeton) Econ. Dept., Princeton Univ., Princeton, 
N. J. 

Dyson, John D., B.S. (South Dakota State) Major, U. S. Army, Fitzsimons Gen. Hosp., 
Denver, 108 S. Jefferson, Pierre, So. Dak. 

Elmore, Francis B., B.S. (Clemson) Capt., Ord. Dept. Inspection of Ammunition, 505 
Kingston Drive, St. Louis iS, Mo. 

Franzen, Raymond, Ph.D. (Columbia) Stat. Consultant, 10 Rockefeller Plaza, New York 
20, N. Y. 

Friedman, Bernard, Ph.D. (Mass. Inst. Tech.) Res. Math., A.M.P.,.N.Y.U., $741 81 St. 
Jackson Heights, N. Y. 

Gordon, J. J., Staff Stat. Eng. Quality Control, Western Electric Company, Inc., 100 Central 
Ave., Kearny, New Jersey. 

Gough, Elsie L., M.A. (Michigan) Auditing Clerk, 648 Blvd. Way, Oakland 10, Calif. 

Greene, Kenneth E., B.S. (Yale) Asst. Res. Mgr., Nat. Broadcasting, 4784 Post Road, 
Pelham Manor 65, N. Y. 

Haskins, Asso. Prof. Elmer E., Ph.D. (Boston) Northeastern Univ., Boston, 53 Damien 
Rd., Wellesley Hills 82, Mass. 

Humes, Helen M., M.A. (Pittsburgh) Price Econ., Bureau of Labor Stat., U. S. Dept. of
Labor, 3703 34th St., N.W., Washington 8, D. C. 

Jackson, Irwin E., M.A. Mec. Eng. (Pennsylvania) Lt., Cadet Ground School Inst., Box 
163, Tuskegee Army Air Field, Tuskegee, Ala. 

Jarrett, Rheem F., B.A. (Arizona) Lecturer in Psych., Dept. Psych., Univ. of Calif., 
Berkeley 4, Calif. 
Johnsen, Madeline, A.M. (Stanford) tW iUh Ave., San Francisco, Calif.

King, Frederick G., B.A. (Harvard) Capt. U. S. Army, 1629 Que St., N.W., Washington,
D. C. 

Leipnik, Roy B., B.S. (Chicago) Res. Asst., Cowles Comm. for Res. in Economics, 8,
Kenwood, Chicago 37, Ill.

Lesser, Grace L., B.A. (Hunter) Asst. Math., Applied Math. Group, Columbia Univ.,
1576 Unionport Rd., Bronx 62, N. Y.

deLoor, Prof. Barend, Ph.D. (Amsterdam) Univ. of Pretoria, Pretoria, Union of South
Africa.

MacNeish, Harris F., Ph.D. (Chicago) Chairman Math. Dept., Brooklyn College, Bedford 
Ave. & Ave. H, Brooklyn, N. Y. 

Maddrill, James D., Ph.D. (California) Math. Res. and Dev., Ballistic Res. Lab., Aberdeen
Proving Ground, Md.

Madow, Lilian H., M.A. (American) 1^46 Ogden Street, N.W., Washington 10, D. C. 

Martian, Dixon M., M.A. (Columbia) Instr. Math., U. S. Military Academy, West Point,
N. Y.

Martin, Charles C., B.S. (U. S. Military Academy) Lt., Ordnance Dept. U. S. Army, Box 
363, Hot Springs, New Mexico. 

Monderer, Phyllis, B.A. (Hunter) Asst. Math., Applied Math. Group, Columbia Univ. 
Div. War Res., New York, N. Y., 6iS9 West 170 Street, New York S3, N. Y. 

Moore, Margaret W., B.A. (Wilson) Stat., P-3, War Dept., Letterkenny Ordnance Depot,
Chambersburg, Pa., S04 Lincoln Way West, Chambersburg, Pa.

Pope, Otis, Ph.D. (Iowa State) Senior Biometrician, USDA, Tech. Collaboration Branch, 
Washington, D. C. 

Priestley, Alice E., M.A. (New York) Instr. Stat. and Math., Wilson College, Chambersburg,
Pa.

Rafferty, J. Allan, B.S. (Harvard) Medical Student, Pfc., ASTP (AUS) Box 236, Rochester 
Med. School, Rochester 7, N. Y. 

Randall, Robert J., B.S. (Yale) Lt., Post Weight and Balance Officer, Tuskegee Army Air 
Field, Tuskegee, Ala. 

Reiner, Mae, B.A. (Hunter) Asst. Math., Applied Math. Group, Columbia Univ., Div.
of War Res., 170 Second Avenue, New York 5, N. Y. 

Rodal, Prof. Juan A., Ph.D. (Buenos Aires) Univ. of Buenos Aires, Aviles 3756, Buenos 
Aires, Argentina. 

Rubin, Herman, S.M. (Chicago) 7143 East End Ave., Chicago Jfi, III. 

Schmalz, W. H., B.A. (Toronto) Tech. Supt., Merchants Factory Dominion Rubber Co., 
Kitchener, Ont., Canada. 

Simmons, Willard R., M.A. (Duke) Head of Stat. Section, Food and Automotive Rationing
Div., OPA, 1430 Saratoga Ave., N.E., Washington, D. C.

Sobel, Milton, B.S. (C.C.N.Y.) 38 Elliot Place, The Bronx, N. Y. 

Stauber, B. R., M.A. (Minnesota) Chief, Relocation Planning Div., War Relocation 
Authority, U. S. Dept, of the Interior, 9701 Bexhill Drive, Kensington, Maryland. 

Steen, Jerome R., B.S. (Wisconsin) Mgr., Quality Control Eng., Sylvania Electric Products,
Inc., Emporium, Pa.

Sullivan, John W., Sc.D. (Mass. Inst. Tech.) Metallurgist, American Iron and Steel In¬ 
stitute, 360 Fifth Ave., New York 1, N. Y. 

Trowbridge, Frederick, Quality Control Eng., Sentinel Radio Corp., 2020 Ridge Ave., Evanston,
Ill.

Week, Frank A., B.A. (Stanford) Capt., MAC, AUS, Chief, Stat. Analysis Branch, Medical
Stat. Div., Office of the Surgeon General, 1818 H St., N.W., Washington 26, D. C.

Weiss, Samuel, M.A. (Michigan) Chief, Manpower Estimates Section, War Manpower 
Comm., 3073 S. Buchanan, Arlington, Virginia.

Wold, Prof. Herman O., Ph.D. (Stockholm) Univ. of Uppsala, Stat. Inst., Odinslund 2,
Uppsala, Sweden. 


Announcement of the St. Louis Meeting of the Institute

The Institute of Mathematical Statistics will hold a joint meeting with 
Section A (Mathematics) of the American Association for the Advancement of 
Science on Saturday, March 30 at 2 P.M. in St. Louis. All the details are 
not yet available but the session will feature (1) contributed papers on Statis¬ 
tics and Probability, (2) an address by Lt. Commander John H. Curtiss on the 
topic Statistical Inference and its Engineering Applications, and (3) an address
by Mr. Morris H. Hansen on Sampling Problems in Surveys of Business and
Population. 


Meeting of Washington Chapter 

A joint regional meeting of the Washington Chapter of the Institute and the 
Washington Chapter of the American Statistical Association is being planned 
for April 12-13, 1946. 



MEMBERS OF THE INSTITUTE OF MATHEMATICAL
STATISTICS*

(As of November 16, 1945)

(The names of Fellows of the Institute are designated by * and Life Members by †)

Abbey, Helen M.A. (Michigan) Stat., Bur. of Records and Stat., Mich. Dept. of Health,
dlB N. Chestnut, Lansing, Mich.

Acerboni, Prof. Argentino V. Dr. Ec. (Buenos Aires) Facultad de C. Economicas, Buenos
Aires, Argentina, Larroque iSi, Banfield, Argentina
Acton, Forman Ch.E. (Princeton) T/4, Army of US, Corps of Engineers, S.E.D., Barracks
Area, Oak Ridge, Tenn.

Aitchison, Beatrice Ph.D. (Johns Hopkins) Econ. and Stat. Analyst, Interst. Commerce
Comm., Washington 25, D. C. 1999 S St., N.W., Washington 9, D. C.

Allen, Prof. Roy G. D.Sc. (London) London School of Econ., Houghton St., Aldwych, 
London, W.C. 2. 

Allendoerfer, Asso. Prof. Carl B. Ph.D. (Princeton) Haverford College, Haverford, Pa. 
Alt, Franz L. Ph.D. US Army, 971 Fort Washington Ave., New York City S9 
Alter, Dinsmore Ph.D. (California) Res. Asso. in Math. Theory of Stat., Calif. Inst. of
Tech., Dir. Griffith Observatory, Los Angeles, Calif., Col. T. C., US Army, 911 Pier
Brooklyn Army Base, Brooklyn, N. Y. 

Anderson, Paul H. Ph.D. (Illinois) Econ. Analyst, Office of Surplus Property, Dept. of
Commerce, Washington, D. C. 1998 Blair Mill Rd., Silver Spring, Md. 

Anderson, Asso. Prof. Richard L. Ph.D. (Iowa State) Res. Math., Inst. of Stat., N. C.
State College, Raleigh, N. C. 

Anderson, Theodore W., Jr. Ph.D. (Princeton) Res. Math., Cowles Commission for 
Res. in Econ., Univ. of Chicago, Chicago 37, Ill. 

Andrews, Asst. Prof. T. Gaylord Ph.D. (Nebraska) Univ. of Chicago, Chicago, Ill.
Angell, Dorothy T. Stat. Analyst, Bell Tel. Labs., Murray Hill, N. J. 

Arias, B., Jorge C.E. (Guatemala) 3 Avenida Sur 65, Guatemala City, Guatemala, 
Central America 

Arnold, Asso. Prof. Herbert E. Ph.D. (Yale) Wesleyan Univ., Middletown, Conn.
Arnold, Asst. Prof. Kenneth J. Ph.D. (Mass. Inst. Tech.) Univ. of Wisconsin, Madison 
6, Wis. North Hall 

Aroian, Leo A. Ph.D. (Michigan) Instr. Hunter Coll., New York City. 947 Wadsworth 
Ave., New York City SS

Arrow, Kenneth J. M.A. (Columbia) Lydig Fellow, Columbia Univ., 116th St. and 
Broadway, New York City, Capt. AC, Hq. AAF, Weather Service, Asheville, N. C. 
918 South French Broad Avenue, Asheville 

* Members were asked to supply fresh information for this Directory. Records may be 
inexact or incomplete (1) because of the failure of some member to comply with this request, 

(2) because the directory card became obsolete as a result of an unreported change of address, 

(3) because information about position did not accompany a notice of change of address, or 

(4) because it is impossible to give all the information about men on leave in the standard 
form of “position,” “address,” and (in italics) “home or mail address.” Some members
on leave or in the services have reported the permanent address. Some have reported the
“on leave” or “APO” address, as the mailing address. The addresses given are the last
reported addresses. When an address is known to be in error, it is followed by (last address). 
Changes in addresses or errors in names, titles or addresses, should be reported to
the Secretary.
Astrachan, Asso. Prof. Max Ph.D. (Brown) Antioch College, Yellow Springs, Ohio

Aimer, George B.A. (Western Reserve) IMJ^lS'Iowa Ave., Cleveland Ohio 

Bachelor, Robert W. M.B.A. (Washington) American Bankers Assn., 22 East 40th St., 
New York City 16 

Bacon, Asso. Prof. Harold M. Ph.D. (Stanford) Stanford Univ., Stanford, Calif. Box
1114 

Bailey, Arthur L. B.S. (Michigan) Stat., American Mutual Alliance, 60 E. 42nd St., 
New York City, P. O. Box F75, Ramsey, N. J.

*Baker, Asst. Prof. George A. Ph.D. (Illinois) Asst. Prof. of Math. and Asst. Stat.,
Exp. Sta., Coll. of Agri., Univ. of California, Davis, Calif.

Baldwin, Woodson W. S.B. (Mass. Inst. Tech.) Capt., Ord. Dept., USA Office of Field 
Dir. of Ammunition Plants, 3629 Lindell Blvd., St. Louis 8, Mo. S745 Lindell Blvd.

Bales, R. P. B.A.Sc. (Toronto) Tech. Supt., Dominion Rubber Co., St. Jerome, Que. 
Canada 

Bancroft, Asst. Prof. Theodore A. Ph.D. (Iowa State) Iowa State Coll., Math. Dept., 
Ames, Iowa 

Baños, Olegario Fernandez D.C. (Madrid) Catedratico, Univ. of Madrid, Calle Lopez
de Hoyos 7, Spain

Barkan, Herbert M.A. (Columbia) Econ. Analyst, 60 8th Ave., Brooklyn, N. Y.

Barnes, Jarvis M.A. (George Peabody Coll. for Teachers) Atlanta Board of Educ.
14th Floor, City Hall, Atlanta, Ga. 

Barnes, Prof. John L. Ph.D. (Princeton) Chairman, Dept. of Applied Math., Tufts
Coll., Medford 55, Mass., 16 Ardley Road, Winchester 

Barr, Prof. Arvil S. Ph.D. (Wisconsin) Univ. of Wisconsin, Madison, Wis. 

Barral-Souto, Prof. José Sc.D. (Buenos Aires) Univ. of Buenos Aires, Buenos Aires,
Argentina, Cordoba 1459 

*Bartky, Asso. Dean Walter Ph.D. (Chicago) Univ. of Chicago, Chicago, Ill. 

Bartlett, Maurice D.Sc. (London) Univ. Lecturer, Cambridge, 1S7 Chesterton Road,
Cambridge, Eng. 

Bassford, Horace R. B.A. (Trinity) Vice Pres. and Actuary, Metropolitan Life Ins. Co.,
1 Madison Ave., New York City 10 

*Baten, Prof. William D. Ph.D. (Michigan) Prof. of Math. Mich. State Coll. and Res.
Prof. Mich. Agri. Exp. Sta., Mich. State Coll., E. Lansing, Mich. 411 Marshall St.

Bates, Prof. O. Kenneth Sc.D. (Mass. Inst. Tech.) Prof. of Math. and Head of Dept.,
The St. Lawrence Univ., Canton, N. Y. 

Battin, Asst. Prof. Isaac L. A.M. (Swarthmore) Drew Univ., Madison, N. J. 14 Glen- 
wild Rd. 

Beall, Geoffrey Ph.D. (London) Res. Asso., Inst. of Paper Chemistry, Appleton, Wis.

Bechhofer, Robert E. B.A. (Columbia) Stat., The Kellex Corp., 233 Broadway, New 
York City. 181 Degraw Avenue, Teaneck, N. J. 

Becker, Harold W. Elec. Inst., Mare Is. Training School, Bldg. 146, Mare Island, Calif. 
14B6 Amador, Vallejo 

Beckstead, Gordon L. Lt.(j.g.) USNR, Weather Central NAS, San Diego, Calif. 

Beebe, Gilbert W. Ph.D. (Columbia) Lt. Sn C., AUS Control Div., Office of the Surgeon
General, 1818 H St., N.W., Washington, D. C.

Been, Richard O. M.A. (George Washington) Sr. Agri. Econ., US Bur. of Agri. Econ., 
3433 South Bldg., Washington, D. C. 

Bellleon, Harold R. S.M. (Mass. Inst. Tech.) Industrial Eng., War Dept., Ord. Dept., 
Pentagon Bldg., Arlington, Va. S416 B St., S.E., Washington 19, D. C. 

Belz, Asso. Prof. Maurice H. M.A. (Melbourne) Univ. of Melbourne, Carlton, N. 3, 
Victoria, Australia 

Bennett, Prof. Albert A. Ph.D. (Princeton) Brown Univ., Providence, R. I. 
Bennett, Blair M. M.A. (Columbia) Asso. Math., Nat. Bur. of Standards, Washington,
D. C. mOM8t„N,W, 

Bennett, Carl A. M.A. (Michigan) Chem., Clinton Engineer Works, Tenn. Eastman
Corp., Knoxville 6, Tenn. F57 West Tenn., Oak Ridge

Berger, Richard M.A. (Columbia) Asso. Stat., Office of Price Admin., Washington,
D. C. Lt.(j.g.) USNR Communication Officer, USS Gainard (DD706) c/o FPO, San
Francisco, Calif. 26 Rugby Road, Rockville Centre, N. Y.

Berkson, Joseph D.Sc. (Johns Hopkins) Col. M.C., US Army, AAF, Office of the Air
Surgeon, Washington 26, D. C. 

Berman, Abraham J. M.A. (Brooklyn) Stat., N. Y. State Dept, of Labor, 80 Center St., 
New York City, IJ^O College Ave., Bronx, N. Y, 

Berwick, Leo A.B. (New York) Capt., A.C. Asst. to Surgeon Stat. Unit of Psych.
Sect., Hdq. AFTRC, T & P Bldg., Fort Worth 2, Texas

Bickerstaff, Asst. Prof. Thomas A. M.A. (Mississippi) Univ. of Mississippi, State 
College, Miss. 

Bigelow, Julian H. W W. 118th St., New York City 27 

Birnbaum, Asst. Prof. Z. William Ph.D. (Lwow) Univ. of Washington, Seattle, Wash.

Blackadar, Walter L. B.A. (McMaster) Asso. Actuary, Equitable Life Assurance Society
of the US, 893 7th Ave., New York City 1

Blackburn, Asso. Prof. Raymond F. Ph.D. (Pittsburgh) Head, Dept. of Stat., Univ. of
Pittsburgh, Pittsburgh 13, Pa. 

Blackwell, Asst. Prof. David Ph.D. (Illinois) Math. Dept. Howard Univ., Washington, 
D. C. 

Blake, Archie Ph.D. (Chicago) Ballistic Res. Lab., Aberdeen Proving Gd. Box 8$, 
Aberdeen, Md. 

Blanche, Ernest E. Ph.D. (Illinois) Foreign Econ. Admin., 515-22nd St., N.W., Wash¬ 
ington, D. C., 9409 Montgomery Ave., N. Chevy Chase, Md., APO 24741 c/o Postmas¬ 
ter, New York City 

*Bliss, Asso. Prof. Chester I. Ph.D. (Columbia) Biometrician, Conn. Agri. Exp. Sta.,
Lecturer in Biometry, Yale Univ., New Haven, Conn. 

Bloom, Rose B.A. (Hunter) 1276 SCSU, Fort Jay, N. Y. 

Bloom, Royal F. M.A. (Minnesota) Lt. Comdr., 4717 Arlington Annex Navy Dept., 
Washington, D. C. 

Blommers, Paul J. Ph.D. (Iowa) Univ. Examiner and Registrar, 114 Univ. Hall, State 
Univ. of Iowa, Iowa City, Iowa 

Boddie, John B., Jr. Chief: Budget Formulation, Foreign Econ. Admin., Washington,
D. C. 2628 Tunlaw Rd., N.W. 

Bonis, Austin J. B.S. (C.C.N.Y.) Major, G-I War Dept. Gen. Staff, Washington, D. C. 
2500 Que St., N.W. 

Bonnar, Robert U. M.S. (Washington) 219 Jefferson St., Vallejo, Calif. 

Boozer, Mary E. A.M. (Chicago) Stat. Res., Virginia State Planning Dc., 301 Finance
Bldg., Richmond 19, Va. 

Borland, James M.A. (Indiana) Capt., Ex. Officer, Inspection Office, Pine Bluff Ar¬ 
senal, Ark. 

†Bowen, Earl K. M.A. (Boston) Instr. in Math., Northeastern Univ., 360 Huntington
Ave., Boston, Mass. 216 Union St., Norwood 

Boschan, Paul Ph.D. (Vienna) Econ. Inst., 600 Fifth Ave., New York City. W4 W, 
40th St., New York City 18 

Bower, Oliver K. Ph.D. (Illinois) Associate, Univ. of Ill., Urbana, Ill. 606 W. John,
Champaign

†Bowker, Albert H. S.B. (Mass. Inst. Tech.) Student, Columbia Univ., New York City
27, 22 Arden Place, Yonkers 8, N. Y.
Brady, Dorothy S. Ph.D. (California) Home Ec. Specialist, Bur. of Home Econ., Washington,
D. C. 6S1S Fulton St., N.W.

Brandt, Alva E. Ph.D. (Iowa State) Chief, Erosion Control Practices Div., Soil Conservation
Service, USDA, US Dept. of Agri., Washington, D. C., Box ISS, Route 5,
Vienna, Va. 

Brearty, Charles R. B.S. (California) Major, US Army, Signal Corp. Inspection Agency,
12th Floor, Public Ledger Bldg., 8th and Chestnut Sts., Philadelphia 6, Pa. 

Breden, Robert E. B.S. (Kansas) Personnel Tech., Personnel Res. Dept., The Proctor 
& Gamble Co., 6th and Main Sts., Cincinnati, Ohio 

Bridger, Clyde A. M.S. (Oregon) Inst. Math., Univ. of Utah, Salt Lake City 1, Utah, 
8S6 Douglas Street, Salt Lake City 8 

Brier, Glenn A.M. (George Washington) Meteorologist, US Weather Bur., Washington 
25, D. C. 

Brixey, Asso. Prof. John C. Ph.D. (Chicago) Univ. of Oklahoma, Norman, Okla. 987
S. Pickard St., Norman 

Brixey, Nancy A.B. (Vassar) Economist, Davis and Gilbert, Law Firm, 1 E. 44 St.
70 East 77 St., New York City 81 

Bronfenbrenner, Martin Ph.D. (Chicago) Lt. (j.g.) USNR, Office of CINCPAC, c/o 
Postmaster, San Francisco, Calif. 788 N. First Ave., Tucson, Ariz. 

Brookner, Ralph J. Ph.D. (Columbia) Lt., USNR Navy Board, Washington, D. C. 
90 Riverside Drive, L., New York City 84 

Brooks, Alvin G. B.A. (Ripon) Chief of Inspection Tasks Sect., Western Electric Co., 
Hawthorne Sta., Chicago, Ill. 4^38 Lawn Ave., Western Springs 

Brown, Asst. Prof. Arthur B. Ph.D. (Harvard) Queens Coll., Flushing, N. Y. 

Brown, Arthur W. A.B. (Princeton) Res. Asso., Columbia Univ., Div. of War Res., 
New York City, Columbia Res. Group M, Room 4311, COMINCH, Navy Dept., 
Washington, D. C. 

Brown, George W. Ph.D. (Princeton) Res. Eng., RCA Labs., Princeton, N. J. 

†Brown, Richard H. A.B. (Columbia) Asso. Math., Navy Dept., Bur. Ord., Washington,
D. C. Rm 316-1 3415 S8th St., Washington 16, D. C. 

Brown, Prof. Theo. H. Ph.D. (Yale) Bus. Stat. Harvard Bus. School, Soldier's Field, 
Boston 63, Mass. 

Brumbaugh, Prof. Martin A. Ph.D. (Pennsylvania) Univ. of Buffalo, Crosby Hall, 
Buffalo 14, N. Y. 

Bruner, Nancy M.A. (Iowa) Stat., Western Auto Supply Company, Kansas City 8, 
Mo. 7611 Main Street, Kansas City 5 

Bruyere, Martha M.D. (Chicago) Stat. US Public Health Service, Bldg. T6, Bethesda,
Md. R.F.D. Route #1, Gaithersburg

Bruyere, Paul T. M.P.H. (Yale) Stat. US Public Health Service, Bldg. T6, Bethesda,
Md. R.F.D. Route #1, Gaithersburg

Bryan, Joseph Ed. M. (Harvard) Mass. Inst, of Tech., Cambridge, Mass. Apt. 603, 
1010 86th St., N.W., Washington, D. C. 

Budne, Thomas A. M.A. (N. J. State Teachers Coll.) Inst. of Math., N. J. State Teachers
Coll., Upper Montclair, N. J. 8038 76th St., Brooklyn 14, N. Y.

Bunke, Alfred M.A. (Columbia) Sen. Stat., N. Y. State Dept. of Labor, 37 Parkwood
St., Albany 3, N. Y. 

Burgess, Robert W. Ph.D. (Cornell) Chief Econ., Western Electric Co., 195 Broadway, 
New York City 7 

Burington, Asso. Prof. Richard S. Ph.D. (Ohio) On leave from Case School of Applied
Science, Cleveland, Ohio, Head Math., Bur. Ord. USN, 5800 N. Carlin Spring Rd.,
Arlington, Va. 

Burk, Marjorie B.A. (Hunter) Stat., Weather Service, Hdq. AAF, Washington, D. C. 
1918 Third Street, N.E. 
Buros, Oscar K. M.A. (Columbia) Major, Signal Corps, Chief, Standards Sect., School
Div., AAF, Washington, D. C. SOI S. Courthouse Rd., Arlington, Va.

Burr, Asso. Prof. Irving W. Ph.D. (Michigan) Purdue Univ., W. Lafayette, Ind.

Bushey, Asso. Prof. J. Hobart Ph.D. (Michigan) Hunter Coll., 695 Park Ave., New
York City 21 

*Camp, Prof. Burton H. Ph.D. (Yale) Wesleyan Univ., Middletown, Conn. HO Mt.
Vernon St.

Campbell, Asso. Prof. Frances L. Ph.D. (Michigan) Geo. Pepperdine Coll., 1121 W. 
79th St., Los Angeles, Calif.

Campbell, George C. M.S. (Iowa) Supervisor, Metropolitan Life Ins. Co., 1 Madison 
Ave., New York City 10. Troy Road R.F.D. $1, Boonton, N. J. 

Campbell, James Ph.D. (Edinburgh) Univ. Math. Lecturer, Victoria Univ. Coll.
Well., W.I. New Zealand 

Canter, Stanley D. B.S. (C.C.N.Y.) Analyst, The Econ. Inst., 600 Fifth Ave., New 
York City, 2676 Aform Ave., The Bronx 56 

Caplan, Benjamin Ph.D. (Chicago) Econ., OPA, 28S1 28th St., N.W., Washington 8, 
D. C. 

Capó, Bernardo G. Ph.D. (Cornell) Biometrician and Head of Dept. of Agronomy and
Horticulture, Agr. Exp. Sta., Rio Piedras, Puerto Rico, S2\ Rosario St., Santurce,
Puerto Rico

Carlson, John L. M.A. (Stanford) Lt. Comdr. USNR, Dept., 2B-Navy 3237-FPO-San 
Francisco, Calif. 

Carlton, A. George B.A. (Gustavus Adolphus) 2.52 W. 102nd St., New York City 25 

Carter, Gerald C. Ph.D. (Purdue) Supervisor of Training and Activities, Univ. of Ill.,
Urbana, Ill. 

†*Carver, Prof. H. C. Ph.D. (Michigan) Dept. of Math., Univ. of Michigan, Ann Arbor,
Mich. 

Casanova, Teobaldo Ph.D. (New York) Res. Stat., Inst. of Legal Soc. Res., Univ. of
Puerto Rico, Rio Piedras, Puerto Rico 

Cederberg, Prof. William E. Ph.D. (Wisconsin) Augustana Coll., Rock Island, Ill.
2642 22h Ave. 

Chances, Ralph B.B.S. (C.C.N.Y.) 46 W. 83rd St., New York City 

Chang, Calvin C. M.A. (Michigan) Public Accountant, 132 W. First St., Los Angeles, 
Calif. 

Chapman, Roy A. B.S. (Minnesota) Silviculturist, Southern Forest Exp. Sta., New Orleans,
La. Hitchiti Experimental Forest, Round Oak

Chassan, Jack B.S. (C.C.N.Y.) Stat., Office of Stat. Control, Hdq. AAF, 3013 30th 
St., S.E., Washington, D. C. 

Chen, Way Ming Ph.D. (California) 21 Mosswood Road, Berkeley 4, Calif. 

Christopher, Edward E. B.S. (Mass. Inst. Tech.) Res. Analyst, War Dept., Washington,
D. C. 6704 North 26th Street, Arlington, Va.

Churchill, Edmund A.M. (Columbia) Rutgers Univ., New Brunswick, N. J. 

Churchman, C. West Ph.D. (Pennsylvania) Math. Frankford Arsenal, Ord. Lab., 
Philadelphia 37, Pa. 

Clark, Asso. Prof. A. G. A.M. (Colorado) Colorado State Coll. of A. and M., Fort Collins,
Colo. 

Clarkson, Asso. Prof. John M. Ph.D. (Cornell) North Carolina State Coll., Raleigh,
N. C. 

Clifford, Asst. Prof. Paul C. M.A. (Columbia) State Teachers Coll., Montclair, N. J., 
Stat. Consultant, Wright Aeronautical Corp., Paterson, N. J. 641 Upper Mountain 
Ave., Upper Montclair 

Clinedinst, William O. B.S. (Carnegie Inst. Tech.) Eng., National Tube Co., Frick
Bldg., Pittsburgh, Pa. 
Cloudman, Charles G. M.Sc. (Rhode Island State Coll.) Valuation Eng., Ebasco Services,
Inc., 2 Rector St., New York City, 25 Clark St., Brooklyn, N. Y.

Cobb, William J. Stat. Census Bureau, 4036 8th St., N.E., Washington, D. C.

*Cochran, Prof. William G. M.A. (Cambridge) Iowa State Coll., Ames, Iowa 

Cody, Bianca M.A. (Columbia) Stat., US Rubber Co., 1230 Sixth Avenue, New York 
City. S265 Bainbridge Ave., New York City 67

Cody, Donald D. A.B. (Harvard) Equit. Life Ass. Soc., 303 Seventh Ave., New York 
City 

Coggins, Paul P. A.M. (Harvard) Supervising Accountant, Amer. Tel. and Tel. Co., 
195 Broadway, New York City 

Cohen, Alonzo C. Ph.D. (Michigan) Lt. Col., Army Univ. Study Center Hi2, APO 772 
c/o postmaster, New York City 

Cohen, Jozef B. Ph.D. (Cornell) Inst., Dept. of Psych., Cornell Univ., Ithaca, N. Y.

Cohen, Karl Ph.D. (Columbia) Esso Labs. Res. Div., P.O. Box 243, Elizabeth, N. J. 

Coleman, Asst. Prof. Edward P. M.S. (Iowa) Univ. of Omaha, Omaha, Nebraska, Inst.
in Math., US Military Academy, West Point, N. Y. 

Cooper, William W. A.B. (Chicago) Inst. in Econ., Univ. of Chicago, Chicago 37, Ill.
6669 S. Ellis Ave.

Cope, Asso. Prof. T. Freeman Ph.D. (Chicago) Queens Coll., Flushing, N. Y. 

*Copeland, Prof. Arthur H. Ph.D. (Harvard) Univ. of Michigan, Ann Arbor, Mich. 
616 Oswego St. 

Cornell, F. G. Ph.D. (Columbia) Chief, Stat. Res. Service, US Office of Ed., Tempo. M. 
26th and Water Sts., N.W., Washington, D. C. 

Cornfield, Jerome B.S. (New York) Stat. Dept, of Labor, R.F.D. Hi2, Herndon, Va. 

Cotterman, Charles W. Ph.D. (Ohio State) Res. Asso., Lab. of Vertebrate Biology,
Univ. of Michigan. On leave for army service. 6S7 Hawthorne Rd., Winston-Salem, 
N. C. 

Court, Louis M. Ph.D. (Columbia) 141 Broadway, New York City 6 

Cowan, Donald R. G. Ph.D. (Minnesota) Donald R. G. Cowan and Associates, 1216 
Citizens Bldg., Cleveland 14, Ohio 

Cowden, Prof. Dudley J. Ph.D. (Columbia) Econ. Stat., Univ. of N. C., Chapel Hill, 
N. C. Box 515 

Cox, Gerald Ph.D. (Illinois) Res. Chem., Corn Prod. Refining Co., Argo, Ill. 200 South 
7th Ave., La Grange 
Venezuela (1)

Caracas. Michalup. 